Overview

Dataset statistics

Number of variables: 36
Number of observations: 311
Missing cells: 215
Missing cells (%): 1.9%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 402.5 KiB
Average record size in memory: 1.3 KiB
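
The two derived figures in the dataset statistics can be checked from the raw counts. The sketch below assumes that the missing-cell percentage is missing cells divided by variables × observations, and that the average record size is total memory divided by observations. The variable names are illustrative and not part of the report.

```python
# Figures copied from the dataset statistics.
n_variables = 36
n_observations = 311
missing_cells = 215
total_memory_kib = 402.5

# Every observation contributes one cell per variable.
total_cells = n_variables * n_observations

# Assumed definitions of the two derived figures.
missing_pct = 100 * missing_cells / total_cells
avg_record_kib = total_memory_kib / n_observations

print(f"Total cells: {total_cells}")
print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size in memory: {avg_record_kib:.1f} KiB")
```

Under these assumptions the rounded results agree with the reported 1.9% and 1.3 KiB.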

Variable types

Categorical: 25
Numeric: 10
Boolean: 1

Alerts

Employee_Name has a high cardinality: 311 distinct values (High cardinality)
DOB has a high cardinality: 307 distinct values (High cardinality)
DateofHire has a high cardinality: 101 distinct values (High cardinality)
DateofTermination has a high cardinality: 96 distinct values (High cardinality)
LastPerformanceReview_Date has a high cardinality: 137 distinct values (High cardinality)
EmpID is highly overall correlated with PerfScoreID and 4 other fields (High correlation)
DeptID is highly overall correlated with Salary and 8 other fields (High correlation)
Salary is highly overall correlated with DeptID and 5 other fields (High correlation)
PositionID is highly overall correlated with DeptID and 7 other fields (High correlation)
Zip is highly overall correlated with DeptID and 5 other fields (High correlation)
ManagerID is highly overall correlated with DeptID and 5 other fields (High correlation)
EngagementSurvey is highly overall correlated with EmpID and 5 other fields (High correlation)
SpecialProjectsCount is highly overall correlated with DeptID and 7 other fields (High correlation)
DaysLateLast30 is highly overall correlated with EmpID and 5 other fields (High correlation)
MarriedID is highly overall correlated with MaritalStatusID and 2 other fields (High correlation)
MaritalStatusID is highly overall correlated with MarriedID and 1 other field (High correlation)
GenderID is highly overall correlated with Sex (High correlation)
EmpStatusID is highly overall correlated with Termd and 2 other fields (High correlation)
PerfScoreID is highly overall correlated with EmpID and 4 other fields (High correlation)
FromDiversityJobFairID is highly overall correlated with RaceDesc and 1 other field (High correlation)
Termd is highly overall correlated with EmpStatusID and 2 other fields (High correlation)
Position is highly overall correlated with DeptID and 10 other fields (High correlation)
State is highly overall correlated with DeptID and 9 other fields (High correlation)
Sex is highly overall correlated with GenderID (High correlation)
MaritalDesc is highly overall correlated with MarriedID and 1 other field (High correlation)
HispanicLatino is highly overall correlated with Position and 5 other fields (High correlation)
RaceDesc is highly overall correlated with FromDiversityJobFairID and 3 other fields (High correlation)
DateofTermination is highly overall correlated with MarriedID and 8 other fields (High correlation)
TermReason is highly overall correlated with EmpStatusID and 4 other fields (High correlation)
EmploymentStatus is highly overall correlated with EmpStatusID and 2 other fields (High correlation)
Department is highly overall correlated with DeptID and 9 other fields (High correlation)
ManagerName is highly overall correlated with DeptID and 10 other fields (High correlation)
RecruitmentSource is highly overall correlated with FromDiversityJobFairID and 3 other fields (High correlation)
PerformanceScore is highly overall correlated with EmpID and 3 other fields (High correlation)
EmpSatisfaction is highly overall correlated with EmpID and 3 other fields (High correlation)
Absences is highly overall correlated with DateofTermination (High correlation)
DateofTermination has 207 (66.6%) missing values (Missing)
ManagerID has 8 (2.6%) missing values (Missing)
Employee_Name is uniformly distributed (Uniform)
EmpID is uniformly distributed (Uniform)
DOB is uniformly distributed (Uniform)
DateofTermination is uniformly distributed (Uniform)
Employee_Name has unique values (Unique)
EmpID has unique values (Unique)
SpecialProjectsCount has 241 (77.5%) zeros (Zeros)
DaysLateLast30 has 278 (89.4%) zeros (Zeros)

Reproduction

Analysis started: 2022-12-17 23:45:29.657325
Analysis finished: 2022-12-17 23:46:07.508932
Duration: 37.85 seconds
Software version: pandas-profiling v3.5.0
Download configuration: config.json

Variables

Employee_Name
Categorical

HIGH CARDINALITY
UNIFORM
UNIQUE

Distinct311
Distinct (%)100.0%
Missing0
Missing (%)0.0%
Memory size21.9 KiB
Adinolfi, Wilson K
 
1
O'hare, Lynn
 
1
Patronick, Lucas
 
1
Panjwani, Nina
 
1
Ozark, Travis
 
1
Other values (306)
306 

Length

Max length25
Median length22
Mean length14.755627
Min length8

Characters and Unicode

Total characters4589
Distinct characters55
Distinct categories5
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique311
Unique (%)100.0%

Sample

1st rowAdinolfi, Wilson K
2nd rowAit Sidi, Karthikeyan
3rd rowAkinkuolie, Sarah
4th rowAlagbe,Trina
5th rowAnderson, Carol

Common Values

ValueCountFrequency (%)
Adinolfi, Wilson K 1
 
0.3%
O'hare, Lynn 1
 
0.3%
Patronick, Lucas 1
 
0.3%
Panjwani, Nina 1
 
0.3%
Ozark, Travis 1
 
0.3%
Owad, Clinton 1
 
0.3%
Osturnka, Adeel 1
 
0.3%
Onque, Jasmine 1
 
0.3%
Oliver, Brooke 1
 
0.3%
Nowlan, Kristie 1
 
0.3%
Other values (301) 301
96.8%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
smith 5
 
0.8%
susan 4
 
0.6%
michael 4
 
0.6%
linda 3
 
0.5%
robinson 3
 
0.5%
j 3
 
0.5%
amy 3
 
0.5%
jennifer 3
 
0.5%
lisa 3
 
0.5%
barbara 3
 
0.5%
Other values (556) 601
94.6%

Most occurring characters

ValueCountFrequency (%)
448
 
9.8%
a 405
 
8.8%
e 376
 
8.2%
n 347
 
7.6%
, 311
 
6.8%
i 285
 
6.2%
r 256
 
5.6%
o 234
 
5.1%
l 206
 
4.5%
s 144
 
3.1%
Other values (45) 1577
34.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 3175
69.2%
Uppercase Letter 649
 
14.1%
Space Separator 448
 
9.8%
Other Punctuation 314
 
6.8%
Dash Punctuation 3
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 405
12.8%
e 376
11.8%
n 347
10.9%
i 285
9.0%
r 256
 
8.1%
o 234
 
7.4%
l 206
 
6.5%
s 144
 
4.5%
t 141
 
4.4%
h 111
 
3.5%
Other values (16) 670
21.1%
Uppercase Letter
ValueCountFrequency (%)
S 59
 
9.1%
M 53
 
8.2%
B 52
 
8.0%
J 50
 
7.7%
C 48
 
7.4%
L 45
 
6.9%
A 37
 
5.7%
R 34
 
5.2%
D 33
 
5.1%
K 30
 
4.6%
Other values (15) 208
32.0%
Other Punctuation
ValueCountFrequency (%)
, 311
99.0%
' 3
 
1.0%
Space Separator
ValueCountFrequency (%)
448
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 3
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3824
83.3%
Common 765
 
16.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 405
 
10.6%
e 376
 
9.8%
n 347
 
9.1%
i 285
 
7.5%
r 256
 
6.7%
o 234
 
6.1%
l 206
 
5.4%
s 144
 
3.8%
t 141
 
3.7%
h 111
 
2.9%
Other values (41) 1319
34.5%
Common
ValueCountFrequency (%)
448
58.6%
, 311
40.7%
' 3
 
0.4%
- 3
 
0.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4589
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
448
 
9.8%
a 405
 
8.8%
e 376
 
8.2%
n 347
 
7.6%
, 311
 
6.8%
i 285
 
6.2%
r 256
 
5.6%
o 234
 
5.1%
l 206
 
4.5%
s 144
 
3.1%
Other values (45) 1577
34.4%

EmpID
Real number (ℝ)

HIGH CORRELATION
UNIFORM
UNIQUE

Distinct311
Distinct (%)100.0%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean10156
Minimum10001
Maximum10311
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size2.6 KiB

Quantile statistics

Minimum10001
5-th percentile10016.5
Q110078.5
median10156
Q310233.5
95-th percentile10295.5
Maximum10311
Range310
Interquartile range (IQR)155

Descriptive statistics

Standard deviation89.922189
Coefficient of variation (CV)0.008854095
Kurtosis-1.2
Mean10156
Median Absolute Deviation (MAD)78
Skewness0
Sum3158516
Variance8086
MonotonicityNot monotonic
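
EmpID has 311 distinct values between 10001 and 10311, so if the IDs are integers they can only be the consecutive integers in that range. Under that assumption, the quantile and descriptive statistics can be reproduced with Python's standard library. The sample variance (n - 1 denominator) and the linearly interpolated quantiles used here are assumptions chosen because they match the reported figures.

```python
import statistics

# Assumption: integer IDs, so 311 distinct values in [10001, 10311]
# must be every integer in that range.
emp_ids = list(range(10001, 10312))

mean = statistics.mean(emp_ids)
variance = statistics.variance(emp_ids)  # sample variance (n - 1 denominator)
std_dev = statistics.stdev(emp_ids)
cv = std_dev / mean                      # coefficient of variation

# method="inclusive" interpolates linearly between the minimum and maximum.
q1, median, q3 = statistics.quantiles(emp_ids, n=4, method="inclusive")
iqr = q3 - q1

# Median absolute deviation around the median.
mad = statistics.median(abs(x - median) for x in emp_ids)

print(f"Sum={sum(emp_ids)} Mean={mean} Variance={variance}")
print(f"Std={std_dev:.6f} CV={cv:.9f}")
print(f"Q1={q1} Median={median} Q3={q3} IQR={iqr} MAD={mad}")
```

The perfectly even spacing of the IDs is also consistent with the Uniform alert and the skewness of 0.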
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
10026 1
 
0.3%
10303 1
 
0.3%
10005 1
 
0.3%
10148 1
 
0.3%
10041 1
 
0.3%
10281 1
 
0.3%
10021 1
 
0.3%
10121 1
 
0.3%
10078 1
 
0.3%
10104 1
 
0.3%
Other values (301) 301
96.8%
ValueCountFrequency (%)
10001 1
0.3%
10002 1
0.3%
10003 1
0.3%
10004 1
0.3%
10005 1
0.3%
10006 1
0.3%
10007 1
0.3%
10008 1
0.3%
10009 1
0.3%
10010 1
0.3%
ValueCountFrequency (%)
10311 1
0.3%
10310 1
0.3%
10309 1
0.3%
10308 1
0.3%
10307 1
0.3%
10306 1
0.3%
10305 1
0.3%
10304 1
0.3%
10303 1
0.3%
10302 1
0.3%

MarriedID
Categorical

Distinct2
Distinct (%)0.6%
Missing0
Missing (%)0.0%
Memory size17.7 KiB
0
187 
1
124 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters311
Distinct characters2
Distinct categories1
Distinct scripts1
Distinct blocks1

Unique

Unique0
Unique (%)0.0%

Sample

1st row0
2nd row1
3rd row1
4th row1
5th row0

Common Values

ValueCountFrequency (%)
0 187
60.1%
1 124
39.9%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 187
60.1%
1 124
39.9%

Most occurring characters

ValueCountFrequency (%)
0 187
60.1%
1 124
39.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 311
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 187
60.1%
1 124
39.9%

Most occurring scripts

ValueCountFrequency (%)
Common 311
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 187
60.1%
1 124
39.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 311
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 187
60.1%
1 124
39.9%

MaritalStatusID
Categorical

Distinct5
Distinct (%)1.6%
Missing0
Missing (%)0.0%
Memory size17.7 KiB
0
137 
1
124 
2
30 
3
 
12
4
 
8

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters311
Distinct characters5
Distinct categories1
Distinct scripts1
Distinct blocks1

Unique

Unique0
Unique (%)0.0%

Sample

1st row0
2nd row1
3rd row1
4th row1
5th row2

Common Values

ValueCountFrequency (%)
0 137
44.1%
1 124
39.9%
2 30
 
9.6%
3 12
 
3.9%
4 8
 
2.6%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 137
44.1%
1 124
39.9%
2 30
 
9.6%
3 12
 
3.9%
4 8
 
2.6%

Most occurring characters

ValueCountFrequency (%)
0 137
44.1%
1 124
39.9%
2 30
 
9.6%
3 12
 
3.9%
4 8
 
2.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 311
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 137
44.1%
1 124
39.9%
2 30
 
9.6%
3 12
 
3.9%
4 8
 
2.6%

Most occurring scripts

ValueCountFrequency (%)
Common 311
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 137
44.1%
1 124
39.9%
2 30
 
9.6%
3 12
 
3.9%
4 8
 
2.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 311
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 137
44.1%
1 124
39.9%
2 30
 
9.6%
3 12
 
3.9%
4 8
 
2.6%

GenderID
Categorical

Distinct2
Distinct (%)0.6%
Missing0
Missing (%)0.0%
Memory size17.7 KiB
0
176 
1
135 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters311
Distinct characters2
Distinct categories1
Distinct scripts1
Distinct blocks1

Unique

Unique0
Unique (%)0.0%

Sample

1st row1
2nd row1
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
0 176
56.6%
1 135
43.4%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 176
56.6%
1 135
43.4%

Most occurring characters

ValueCountFrequency (%)
0 176
56.6%
1 135
43.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 311
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 176
56.6%
1 135
43.4%

Most occurring scripts

ValueCountFrequency (%)
Common 311
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 176
56.6%
1 135
43.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 311
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 176
56.6%
1 135
43.4%

EmpStatusID
Categorical

Distinct5
Distinct (%)1.6%
Missing0
Missing (%)0.0%
Memory size17.7 KiB
1
184 
5
88 
3
 
14
4
 
14
2
 
11

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters311
Distinct characters5
Distinct categories1
Distinct scripts1
Distinct blocks1

Unique

Unique0
Unique (%)0.0%

Sample

1st row1
2nd row5
3rd row5
4th row1
5th row5

Common Values

ValueCountFrequency (%)
1 184
59.2%
5 88
28.3%
3 14
 
4.5%
4 14
 
4.5%
2 11
 
3.5%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
1 184
59.2%
5 88
28.3%
3 14
 
4.5%
4 14
 
4.5%
2 11
 
3.5%

Most occurring characters

ValueCountFrequency (%)
1 184
59.2%
5 88
28.3%
3 14
 
4.5%
4 14
 
4.5%
2 11
 
3.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 311
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 184
59.2%
5 88
28.3%
3 14
 
4.5%
4 14
 
4.5%
2 11
 
3.5%

Most occurring scripts

ValueCountFrequency (%)
Common 311
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 184
59.2%
5 88
28.3%
3 14
 
4.5%
4 14
 
4.5%
2 11
 
3.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 311
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 184
59.2%
5 88
28.3%
3 14
 
4.5%
4 14
 
4.5%
2 11
 
3.5%

DeptID
Real number (ℝ)

Distinct6
Distinct (%)1.9%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean4.6109325
Minimum1
Maximum6
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size2.6 KiB

Quantile statistics

Minimum1
5-th percentile3
Q15
median5
Q35
95-th percentile6
Maximum6
Range5
Interquartile range (IQR)0

Descriptive statistics

Standard deviation1.0834872
Coefficient of variation (CV)0.23498224
Kurtosis2.2414339
Mean4.6109325
Median Absolute Deviation (MAD)0
Skewness-1.5363915
Sum1434
Variance1.1739446
MonotonicityNot monotonic
Histogram with fixed size bins (bins=6)
ValueCountFrequency (%)
5 208
66.9%
3 50
 
16.1%
6 32
 
10.3%
4 10
 
3.2%
1 10
 
3.2%
2 1
 
0.3%
ValueCountFrequency (%)
1 10
 
3.2%
2 1
 
0.3%
3 50
 
16.1%
4 10
 
3.2%
5 208
66.9%
6 32
 
10.3%
ValueCountFrequency (%)
6 32
 
10.3%
5 208
66.9%
4 10
 
3.2%
3 50
 
16.1%
2 1
 
0.3%
1 10
 
3.2%

PerfScoreID
Categorical

Distinct4
Distinct (%)1.3%
Missing0
Missing (%)0.0%
Memory size17.7 KiB
3
243 
4
37 
2
 
18
1
 
13

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters311
Distinct characters4
Distinct categories1
Distinct scripts1
Distinct blocks1

Unique

Unique0
Unique (%)0.0%

Sample

1st row4
2nd row3
3rd row3
4th row3
5th row3

Common Values

ValueCountFrequency (%)
3 243
78.1%
4 37
 
11.9%
2 18
 
5.8%
1 13
 
4.2%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
3 243
78.1%
4 37
 
11.9%
2 18
 
5.8%
1 13
 
4.2%

Most occurring characters

ValueCountFrequency (%)
3 243
78.1%
4 37
 
11.9%
2 18
 
5.8%
1 13
 
4.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 311
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
3 243
78.1%
4 37
 
11.9%
2 18
 
5.8%
1 13
 
4.2%

Most occurring scripts

ValueCountFrequency (%)
Common 311
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
3 243
78.1%
4 37
 
11.9%
2 18
 
5.8%
1 13
 
4.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 311
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
3 243
78.1%
4 37
 
11.9%
2 18
 
5.8%
1 13
 
4.2%

FromDiversityJobFairID
Categorical

Distinct2
Distinct (%)0.6%
Missing0
Missing (%)0.0%
Memory size17.7 KiB
0
282 
1
29 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters311
Distinct characters2
Distinct categories1
Distinct scripts1
Distinct blocks1

Unique

Unique0
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
0 282
90.7%
1 29
 
9.3%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 282
90.7%
1 29
 
9.3%

Most occurring characters

ValueCountFrequency (%)
0 282
90.7%
1 29
 
9.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 311
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 282
90.7%
1 29
 
9.3%

Most occurring scripts

ValueCountFrequency (%)
Common 311
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 282
90.7%
1 29
 
9.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 311
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 282
90.7%
1 29
 
9.3%

Salary
Real number (ℝ)

Distinct308
Distinct (%)99.0%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean69020.685
Minimum45046
Maximum250000
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size2.6 KiB

Quantile statistics

Minimum45046
5-th percentile46999.5
Q155501.5
median62810
Q372036
95-th percentile108106.5
Maximum250000
Range204954
Interquartile range (IQR)16534.5

Descriptive statistics

Standard deviation25156.637
Coefficient of variation (CV)0.36447968
Kurtosis15.452149
Mean69020.685
Median Absolute Deviation (MAD)7982
Skewness3.3061808
Sum21465433
Variance6.3285638 × 10^8
MonotonicityNot monotonic
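
Several of the Salary figures are derived from others in the same tables. The sketch below recomputes them from the reported values only (no raw data is used) and assumes the usual definitions: range = maximum - minimum, IQR = Q3 - Q1, CV = standard deviation / mean, and variance = standard deviation squared.

```python
# Values copied from the Salary statistics in this report.
minimum, maximum = 45046, 250000
q1, q3 = 55501.5, 72036
mean, std_dev = 69020.685, 25156.637

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range
cv = std_dev / mean               # Coefficient of variation
variance = std_dev ** 2           # Variance, up to rounding of std_dev

print(f"Range: {value_range}")
print(f"IQR: {iqr}")
print(f"CV: {cv:.8f}")
print(f"Variance: {variance:.7e}")
```

The recomputed variance is on the order of 10^8, which is how the reported variance figure should be read.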
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
63025 2
 
0.6%
57815 2
 
0.6%
61242 2
 
0.6%
66738 1
 
0.3%
68829 1
 
0.3%
53060 1
 
0.3%
47414 1
 
0.3%
63051 1
 
0.3%
71966 1
 
0.3%
52674 1
 
0.3%
Other values (298) 298
95.8%
ValueCountFrequency (%)
45046 1
0.3%
45069 1
0.3%
45115 1
0.3%
45395 1
0.3%
45433 1
0.3%
45998 1
0.3%
46120 1
0.3%
46335 1
0.3%
46428 1
0.3%
46430 1
0.3%
ValueCountFrequency (%)
250000 1
0.3%
220450 1
0.3%
180000 1
0.3%
178000 1
0.3%
170500 1
0.3%
157000 1
0.3%
150290 1
0.3%
148999 1
0.3%
140920 1
0.3%
138888 1
0.3%

Termd
Categorical

Distinct2
Distinct (%)0.6%
Missing0
Missing (%)0.0%
Memory size17.7 KiB
0
207 
1
104 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters311
Distinct characters2
Distinct categories1
Distinct scripts1
Distinct blocks1

Unique

Unique0
Unique (%)0.0%

Sample

1st row0
2nd row1
3rd row1
4th row0
5th row1

Common Values

ValueCountFrequency (%)
0 207
66.6%
1 104
33.4%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 207
66.6%
1 104
33.4%

Most occurring characters

ValueCountFrequency (%)
0 207
66.6%
1 104
33.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 311
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 207
66.6%
1 104
33.4%

Most occurring scripts

ValueCountFrequency (%)
Common 311
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 207
66.6%
1 104
33.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 311
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 207
66.6%
1 104
33.4%

PositionID
Real number (ℝ)

Distinct30
Distinct (%)9.6%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean16.845659
Minimum1
Maximum30
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size2.6 KiB

Quantile statistics

Minimum1
5-th percentile3
Q118
median19
Q320
95-th percentile24
Maximum30
Range29
Interquartile range (IQR)2

Descriptive statistics

Standard deviation6.2234187
Coefficient of variation (CV)0.36943753
Kurtosis0.81234602
Mean16.845659
Median Absolute Deviation (MAD)1
Skewness-1.2316765
Sum5239
Variance38.730941
MonotonicityNot monotonic
Histogram with fixed size bins (bins=30)
ValueCountFrequency (%)
19 137
44.1%
20 57
18.3%
3 27
 
8.7%
18 13
 
4.2%
24 9
 
2.9%
14 8
 
2.6%
9 8
 
2.6%
15 5
 
1.6%
28 5
 
1.6%
8 5
 
1.6%
Other values (20) 37
 
11.9%
ValueCountFrequency (%)
1 3
 
1.0%
2 3
 
1.0%
3 27
8.7%
4 4
 
1.3%
5 1
 
0.3%
6 1
 
0.3%
7 1
 
0.3%
8 5
 
1.6%
9 8
 
2.6%
10 1
 
0.3%
ValueCountFrequency (%)
30 1
 
0.3%
29 1
 
0.3%
28 5
1.6%
27 2
 
0.6%
26 2
 
0.6%
25 1
 
0.3%
24 9
2.9%
23 2
 
0.6%
22 3
 
1.0%
21 3
 
1.0%

Position
Categorical

Distinct32
Distinct (%)10.3%
Missing0
Missing (%)0.0%
Memory size23.7 KiB
Production Technician I
137 
Production Technician II
57 
Area Sales Manager
27 
Production Manager
14 
Software Engineer
 
10
Other values (27)
66 

Length

Max length28
Median length24
Mean length20.720257
Min length3

Characters and Unicode

Total characters6444
Distinct characters38
Distinct categories5
Distinct scripts2
Distinct blocks1

Unique

Unique14
Unique (%)4.5%

Sample

1st rowProduction Technician I
2nd rowSr. DBA
3rd rowProduction Technician II
4th rowProduction Technician I
5th rowProduction Technician I

Common Values

ValueCountFrequency (%)
Production Technician I 137
44.1%
Production Technician II 57
18.3%
Area Sales Manager 27
 
8.7%
Production Manager 14
 
4.5%
Software Engineer 10
 
3.2%
IT Support 8
 
2.6%
Data Analyst 7
 
2.3%
Sr. Network Engineer 5
 
1.6%
Database Administrator 5
 
1.6%
Network Engineer 5
 
1.6%
Other values (22) 36
 
11.6%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
production 208
24.1%
technician 194
22.5%
i 140
16.2%
ii 57
 
6.6%
manager 50
 
5.8%
sales 31
 
3.6%
area 27
 
3.1%
engineer 20
 
2.3%
it 13
 
1.5%
software 11
 
1.3%
Other values (29) 113
13.1%

Most occurring characters

ValueCountFrequency (%)
n 726
11.3%
i 656
10.2%
c 618
 
9.6%
554
 
8.6%
o 473
 
7.3%
a 426
 
6.6%
e 412
 
6.4%
r 387
 
6.0%
t 306
 
4.7%
I 277
 
4.3%
Other values (28) 1609
25.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4931
76.5%
Uppercase Letter 945
 
14.7%
Space Separator 554
 
8.6%
Other Punctuation 10
 
0.2%
Dash Punctuation 4
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 726
14.7%
i 656
13.3%
c 618
12.5%
o 473
9.6%
a 426
8.6%
e 412
8.4%
r 387
7.8%
t 306
6.2%
u 222
 
4.5%
d 218
 
4.4%
Other values (12) 487
9.9%
Uppercase Letter
ValueCountFrequency (%)
I 277
29.3%
P 210
22.2%
T 207
21.9%
S 65
 
6.9%
A 56
 
5.9%
M 50
 
5.3%
D 30
 
3.2%
E 23
 
2.4%
B 12
 
1.3%
N 10
 
1.1%
Other values (2) 5
 
0.5%
Other Punctuation
ValueCountFrequency (%)
. 9
90.0%
& 1
 
10.0%
Space Separator
ValueCountFrequency (%)
554
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 5876
91.2%
Common 568
 
8.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
n 726
12.4%
i 656
11.2%
c 618
10.5%
o 473
 
8.0%
a 426
 
7.2%
e 412
 
7.0%
r 387
 
6.6%
t 306
 
5.2%
I 277
 
4.7%
u 222
 
3.8%
Other values (24) 1373
23.4%
Common
ValueCountFrequency (%)
554
97.5%
. 9
 
1.6%
- 4
 
0.7%
& 1
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6444
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n 726
11.3%
i 656
10.2%
c 618
 
9.6%
554
 
8.6%
o 473
 
7.3%
a 426
 
6.6%
e 412
 
6.4%
r 387
 
6.0%
t 306
 
4.7%
I 277
 
4.3%
Other values (28) 1609
25.0%

State
Categorical

Distinct28
Distinct (%)9.0%
Missing0
Missing (%)0.0%
Memory size18.0 KiB
MA
276 
CT
 
6
TX
 
3
VT
 
2
UT
 
1
Other values (23)
 
23

Length

Max length2
Median length2
Mean length2
Min length2

Characters and Unicode

Total characters622
Distinct characters22
Distinct categories1
Distinct scripts1
Distinct blocks1

Unique

Unique24
Unique (%)7.7%

Sample

1st rowMA
2nd rowMA
3rd rowMA
4th rowMA
5th rowMA

Common Values

ValueCountFrequency (%)
MA 276
88.7%
CT 6
 
1.9%
TX 3
 
1.0%
VT 2
 
0.6%
UT 1
 
0.3%
AZ 1
 
0.3%
ND 1
 
0.3%
OR 1
 
0.3%
MT 1
 
0.3%
NV 1
 
0.3%
Other values (18) 18
 
5.8%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
ma 276
88.7%
ct 6
 
1.9%
tx 3
 
1.0%
vt 2
 
0.6%
ny 1
 
0.3%
va 1
 
0.3%
al 1
 
0.3%
wa 1
 
0.3%
ca 1
 
0.3%
oh 1
 
0.3%
Other values (18) 18
 
5.8%

Most occurring characters

ValueCountFrequency (%)
A 283
45.5%
M 278
44.7%
T 14
 
2.3%
C 9
 
1.4%
N 7
 
1.1%
V 4
 
0.6%
I 3
 
0.5%
X 3
 
0.5%
O 3
 
0.5%
H 2
 
0.3%
Other values (12) 16
 
2.6%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 622
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 283
45.5%
M 278
44.7%
T 14
 
2.3%
C 9
 
1.4%
N 7
 
1.1%
V 4
 
0.6%
I 3
 
0.5%
X 3
 
0.5%
O 3
 
0.5%
H 2
 
0.3%
Other values (12) 16
 
2.6%

Most occurring scripts

ValueCountFrequency (%)
Latin 622
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 283
45.5%
M 278
44.7%
T 14
 
2.3%
C 9
 
1.4%
N 7
 
1.1%
V 4
 
0.6%
I 3
 
0.5%
X 3
 
0.5%
O 3
 
0.5%
H 2
 
0.3%
Other values (12) 16
 
2.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 622
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 283
45.5%
M 278
44.7%
T 14
 
2.3%
C 9
 
1.4%
N 7
 
1.1%
V 4
 
0.6%
I 3
 
0.5%
X 3
 
0.5%
O 3
 
0.5%
H 2
 
0.3%
Other values (12) 16
 
2.6%

Zip
Real number (ℝ)

Distinct158
Distinct (%)50.8%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean6555.4823
Minimum1013
Maximum98052
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size2.6 KiB

Quantile statistics

Minimum1013
5-th percentile1725.5
Q11901.5
median2132
Q32355
95-th percentile38674.5
Maximum98052
Range97039
Interquartile range (IQR)453.5

Descriptive statistics

Standard deviation16908.397
Coefficient of variation (CV)2.5792758
Kurtosis16.187425
Mean6555.4823
Median Absolute Deviation (MAD)230
Skewness4.1054943
Sum2038755
Variance2.8589389 × 10^8
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
1886 13
 
4.2%
1810 7
 
2.3%
2045 7
 
2.3%
2176 7
 
2.3%
2451 7
 
2.3%
2169 6
 
1.9%
2110 6
 
1.9%
2170 5
 
1.6%
2324 5
 
1.6%
1460 5
 
1.6%
Other values (148) 243
78.1%
ValueCountFrequency (%)
1013 1
 
0.3%
1040 1
 
0.3%
1420 2
 
0.6%
1450 2
 
0.6%
1460 5
1.6%
1545 1
 
0.3%
1550 1
 
0.3%
1701 2
 
0.6%
1721 1
 
0.3%
1730 2
 
0.6%
ValueCountFrequency (%)
98052 1
0.3%
97756 1
0.3%
90007 1
0.3%
89139 1
0.3%
85006 1
0.3%
84111 1
0.3%
83706 1
0.3%
80820 1
0.3%
78789 1
0.3%
78230 1
0.3%
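
Zip is profiled as a real number, and most values have only four digits (minimum 1013, median 2132). Since 88.7% of rows have State MA, where ZIP codes begin with 0, a likely explanation is that leading zeros were dropped when the column was read as numeric. The sketch below restores a five-character code; `format_zip` is a hypothetical helper, not something the report provides.

```python
def format_zip(zip_number: int) -> str:
    """Render a numeric ZIP as a 5-character string, restoring leading zeros."""
    return str(int(zip_number)).zfill(5)


# Sample values taken from the Zip tables in this report.
for z in (1013, 1886, 98052):
    print(z, "->", format_zip(z))
```

If that explanation holds, treating Zip as a categorical string would be more meaningful than the mean and variance shown here.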

DOB
Categorical

HIGH CARDINALITY
UNIFORM

Distinct307
Distinct (%)98.7%
Missing0
Missing (%)0.0%
Memory size19.9 KiB
09/09/65
 
2
06/14/87
 
2
09/22/76
 
2
07/07/84
 
2
05/12/80
 
1
Other values (302)
302 

Length

Max length8
Median length8
Mean length8
Min length8

Characters and Unicode

Total characters2488
Distinct characters11
Distinct categories2
Distinct scripts1
Distinct blocks1

Unique

Unique303
Unique (%)97.4%

Sample

1st row07/10/83
2nd row05/05/75
3rd row09/19/88
4th row09/27/88
5th row09/08/89

Common Values

ValueCountFrequency (%)
09/09/65 2
 
0.6%
06/14/87 2
 
0.6%
09/22/76 2
 
0.6%
07/07/84 2
 
0.6%
05/12/80 1
 
0.3%
11/06/84 1
 
0.3%
05/01/79 1
 
0.3%
05/19/82 1
 
0.3%
11/24/79 1
 
0.3%
12/11/76 1
 
0.3%
Other values (297) 297
95.5%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
09/09/65 2
 
0.6%
09/22/76 2
 
0.6%
07/07/84 2
 
0.6%
06/14/87 2
 
0.6%
01/12/74 1
 
0.3%
09/27/88 1
 
0.3%
09/08/89 1
 
0.3%
05/22/77 1
 
0.3%
05/24/79 1
 
0.3%
02/18/83 1
 
0.3%
Other values (297) 297
95.5%

Most occurring characters

ValueCountFrequency (%)
/ 622
25.0%
0 426
17.1%
1 278
11.2%
8 249
10.0%
7 198
 
8.0%
2 180
 
7.2%
5 119
 
4.8%
9 116
 
4.7%
6 115
 
4.6%
4 95
 
3.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1866
75.0%
Other Punctuation 622
 
25.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 426
22.8%
1 278
14.9%
8 249
13.3%
7 198
10.6%
2 180
9.6%
5 119
 
6.4%
9 116
 
6.2%
6 115
 
6.2%
4 95
 
5.1%
3 90
 
4.8%
Other Punctuation
ValueCountFrequency (%)
/ 622
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 2488
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
/ 622
25.0%
0 426
17.1%
1 278
11.2%
8 249
10.0%
7 198
 
8.0%
2 180
 
7.2%
5 119
 
4.8%
9 116
 
4.7%
6 115
 
4.6%
4 95
 
3.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2488
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
/ 622
25.0%
0 426
17.1%
1 278
11.2%
8 249
10.0%
7 198
 
8.0%
2 180
 
7.2%
5 119
 
4.8%
9 116
 
4.7%
6 115
 
4.6%
4 95
 
3.8%
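
DOB is profiled as a categorical string because it is stored as text in what appears to be MM/DD/YY form (every value is 8 characters long). Parsing it needs care with the two-digit year: Python's `%y` maps 00-68 to 2000-2068, so "09/09/65" would become 2065. The sketch below assumes the MM/DD/YY layout and assumes no birth date can fall after the analysis date; `parse_dob` is a hypothetical helper.

```python
from datetime import date, datetime

# Assumption: no birth date can be later than the report's analysis date.
ANALYSIS_DATE = date(2022, 12, 17)


def parse_dob(text: str) -> date:
    """Parse an MM/DD/YY string, moving future dates back one century."""
    parsed = datetime.strptime(text, "%m/%d/%y").date()
    # %y maps 00-68 to 2000-2068, so "65" is read as 2065.
    if parsed > ANALYSIS_DATE:
        parsed = parsed.replace(year=parsed.year - 100)
    return parsed


# Sample values taken from the DOB tables in this report.
for raw in ("09/09/65", "07/10/83", "09/08/89"):
    print(raw, "->", parse_dob(raw).isoformat())
```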

Sex
Categorical

Distinct2
Distinct (%)0.6%
Missing0
Missing (%)0.0%
Memory size17.9 KiB
F
176 
M
135 

Length

Max length2
Median length1
Mean length1.4340836
Min length1

Characters and Unicode

Total characters446
Distinct characters3
Distinct categories2
Distinct scripts2
Distinct blocks1

Unique

Unique0
Unique (%)0.0%

Sample

1st rowM
2nd rowM
3rd rowF
4th rowF
5th rowF

Common Values

ValueCountFrequency (%)
F 176
56.6%
M 135
43.4%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
f 176
56.6%
m 135
43.4%

Most occurring characters

Value | Count | Frequency (%)
F | 176 | 39.5%
M | 135 | 30.3%
(space) | 135 | 30.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 311
69.7%
Space Separator 135
30.3%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
F 176
56.6%
M 135
43.4%
Space Separator
ValueCountFrequency (%)
135
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 311
69.7%
Common 135
30.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
F 176
56.6%
M 135
43.4%
Common
ValueCountFrequency (%)
135
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 446
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
F 176
39.5%
M 135
30.3%
135
30.3%
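The statistics above (max length 2, and 135 space characters matching the 135 `M` rows) suggest that every `M` value carries one stray space. The sketch below shows how stripping whitespace would bring all values to a single character. It assumes the space is trailing, which the report does not state, and uses the five sample rows as stand-in data.

```python
from collections import Counter

# First five sample rows. The trailing space on "M " is an assumption
# inferred from the length and Space Separator statistics above.
raw = ["M ", "M ", "F", "F", "F"]

# Remove leading and trailing whitespace from every value.
cleaned = [value.strip() for value in raw]

# Compare the distribution of string lengths before and after cleaning.
lengths_before = Counter(len(value) for value in raw)
lengths_after = Counter(len(value) for value in cleaned)
print("before:", dict(lengths_before))
print("after: ", dict(lengths_after))
```

On a pandas column the same cleanup is usually written with `Series.str.strip()`.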

MaritalDesc
Categorical

Distinct: 5
Distinct (%): 1.6%
Missing: 0
Missing (%): 0.0%
Memory size: 19.5 KiB
Single: 137
Married: 124
Divorced: 30
Separated: 12
Widowed: 8

Length

Max length: 9
Median length: 8
Mean length: 6.733119
Min length: 6

Characters and Unicode

Total characters: 2094
Distinct characters: 18
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Single
2nd row: Married
3rd row: Married
4th row: Married
5th row: Divorced

Common Values

Value | Count | Frequency (%)
Single | 137 | 44.1%
Married | 124 | 39.9%
Divorced | 30 | 9.6%
Separated | 12 | 3.9%
Widowed | 8 | 2.6%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
single 137
44.1%
married 124
39.9%
divorced 30
 
9.6%
separated 12
 
3.9%
widowed 8
 
2.6%

Most occurring characters

ValueCountFrequency (%)
e 323
15.4%
i 299
14.3%
r 290
13.8%
d 182
8.7%
S 149
7.1%
a 148
7.1%
l 137
6.5%
g 137
6.5%
n 137
6.5%
M 124
 
5.9%
Other values (8) 168
8.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1783
85.1%
Uppercase Letter 311
 
14.9%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 323
18.1%
i 299
16.8%
r 290
16.3%
d 182
10.2%
a 148
8.3%
l 137
7.7%
g 137
7.7%
n 137
7.7%
o 38
 
2.1%
v 30
 
1.7%
Other values (4) 62
 
3.5%
Uppercase Letter
ValueCountFrequency (%)
S 149
47.9%
M 124
39.9%
D 30
 
9.6%
W 8
 
2.6%

Most occurring scripts

ValueCountFrequency (%)
Latin 2094
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 323
15.4%
i 299
14.3%
r 290
13.8%
d 182
8.7%
S 149
7.1%
a 148
7.1%
l 137
6.5%
g 137
6.5%
n 137
6.5%
M 124
 
5.9%
Other values (8) 168
8.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2094
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 323
15.4%
i 299
14.3%
r 290
13.8%
d 182
8.7%
S 149
7.1%
a 148
7.1%
l 137
6.5%
g 137
6.5%
n 137
6.5%
M 124
 
5.9%
Other values (8) 168
8.0%
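The per-category character counts above can be reproduced from the value counts alone. The sketch below is an illustration using Python's standard `unicodedata` module, not necessarily how the profiling tool computes them. It weights each character's Unicode general category by how often its value occurs.

```python
import unicodedata
from collections import Counter

# Value counts copied from the Common Values table for MaritalDesc.
value_counts = {
    "Single": 137,
    "Married": 124,
    "Divorced": 30,
    "Separated": 12,
    "Widowed": 8,
}

# Tally Unicode general categories ("Lu" = uppercase letter,
# "Ll" = lowercase letter), weighted by how often each value occurs.
category_counts = Counter()
for value, n in value_counts.items():
    for ch in value:
        category_counts[unicodedata.category(ch)] += n

total_characters = sum(category_counts.values())
print(dict(category_counts), total_characters)
```

The totals agree with the report: 1783 lowercase letters, 311 uppercase letters, 2094 characters overall.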

CitizenDesc
Categorical

Distinct: 3
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 20.6 KiB
US Citizen: 295
Eligible NonCitizen: 12
Non-Citizen: 4

Length

Max length: 19
Median length: 10
Mean length: 10.360129
Min length: 10

Characters and Unicode

Total characters: 3222
Distinct characters: 16
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: US Citizen
2nd row: US Citizen
3rd row: US Citizen
4th row: US Citizen
5th row: US Citizen

Common Values

Value | Count | Frequency (%)
US Citizen | 295 | 94.9%
Eligible NonCitizen | 12 | 3.9%
Non-Citizen | 4 | 1.3%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
us 295
47.7%
citizen 295
47.7%
eligible 12
 
1.9%
noncitizen 12
 
1.9%
non-citizen 4
 
0.6%

Most occurring characters

ValueCountFrequency (%)
i 646
20.0%
n 327
10.1%
e 323
10.0%
C 311
9.7%
t 311
9.7%
z 311
9.7%
307
9.5%
U 295
9.2%
S 295
9.2%
l 24
 
0.7%
Other values (6) 72
 
2.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1982
61.5%
Uppercase Letter 929
28.8%
Space Separator 307
 
9.5%
Dash Punctuation 4
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 646
32.6%
n 327
16.5%
e 323
16.3%
t 311
15.7%
z 311
15.7%
l 24
 
1.2%
o 16
 
0.8%
g 12
 
0.6%
b 12
 
0.6%
Uppercase Letter
ValueCountFrequency (%)
C 311
33.5%
U 295
31.8%
S 295
31.8%
N 16
 
1.7%
E 12
 
1.3%
Space Separator
ValueCountFrequency (%)
307
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2911
90.3%
Common 311
 
9.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 646
22.2%
n 327
11.2%
e 323
11.1%
C 311
10.7%
t 311
10.7%
z 311
10.7%
U 295
10.1%
S 295
10.1%
l 24
 
0.8%
N 16
 
0.5%
Other values (4) 52
 
1.8%
Common
ValueCountFrequency (%)
307
98.7%
- 4
 
1.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3222
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 646
20.0%
n 327
10.1%
e 323
10.0%
C 311
9.7%
t 311
9.7%
z 311
9.7%
307
9.5%
U 295
9.2%
S 295
9.2%
l 24
 
0.7%
Other values (6) 72
 
2.2%
Distinct: 2
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 439.0 B
False: 283
True: 28

Value | Count | Frequency (%)
False | 283 | 91.0%
True | 28 | 9.0%

RaceDesc
Categorical

Distinct: 6
Distinct (%): 1.9%
Missing: 0
Missing (%): 0.0%
Memory size: 20.7 KiB
White: 187
Black or African American: 80
Asian: 29
Two or more races: 11
American Indian or Alaska Native: 3

Length

Max length: 32
Median length: 5
Mean length: 10.839228
Min length: 5

Characters and Unicode

Total characters: 3371
Distinct characters: 26
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 0.3%

Sample

1st row: White
2nd row: White
3rd row: White
4th row: White
5th row: White

Common Values

Value | Count | Frequency (%)
White | 187 | 60.1%
Black or African American | 80 | 25.7%
Asian | 29 | 9.3%
Two or more races | 11 | 3.5%
American Indian or Alaska Native | 3 | 1.0%
Hispanic | 1 | 0.3%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
white 187
31.4%
or 94
15.8%
american 83
13.9%
black 80
13.4%
african 80
13.4%
asian 29
 
4.9%
two 11
 
1.8%
more 11
 
1.8%
races 11
 
1.8%
indian 3
 
0.5%
Other values (3) 7
 
1.2%

Most occurring characters

ValueCountFrequency (%)
i 387
11.5%
a 296
 
8.8%
e 295
 
8.8%
285
 
8.5%
r 279
 
8.3%
c 255
 
7.6%
n 199
 
5.9%
A 195
 
5.8%
t 190
 
5.6%
h 187
 
5.5%
Other values (16) 803
23.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2606
77.3%
Uppercase Letter 480
 
14.2%
Space Separator 285
 
8.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 387
14.9%
a 296
11.4%
e 295
11.3%
r 279
10.7%
c 255
9.8%
n 199
7.6%
t 190
7.3%
h 187
7.2%
o 116
 
4.5%
m 94
 
3.6%
Other values (8) 308
11.8%
Uppercase Letter
ValueCountFrequency (%)
A 195
40.6%
W 187
39.0%
B 80
16.7%
T 11
 
2.3%
I 3
 
0.6%
N 3
 
0.6%
H 1
 
0.2%
Space Separator
ValueCountFrequency (%)
285
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3086
91.5%
Common 285
 
8.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 387
12.5%
a 296
9.6%
e 295
9.6%
r 279
9.0%
c 255
8.3%
n 199
 
6.4%
A 195
 
6.3%
t 190
 
6.2%
h 187
 
6.1%
W 187
 
6.1%
Other values (15) 616
20.0%
Common
ValueCountFrequency (%)
285
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3371
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 387
11.5%
a 296
 
8.8%
e 295
 
8.8%
285
 
8.5%
r 279
 
8.3%
c 255
 
7.6%
n 199
 
5.9%
A 195
 
5.8%
t 190
 
5.6%
h 187
 
5.5%
Other values (16) 803
23.8%

DateofHire
Categorical

Distinct: 101
Distinct (%): 32.5%
Missing: 0
Missing (%): 0.0%
Memory size: 20.1 KiB
1/10/2011: 14
3/30/2015: 12
1/5/2015: 11
9/29/2014: 11
7/5/2011: 10
Other values (96): 253

Length

Max length: 10
Median length: 9
Mean length: 8.7009646
Min length: 8

Characters and Unicode

Total characters: 2706
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 54
Unique (%): 17.4%

Sample

1st row: 7/5/2011
2nd row: 3/30/2015
3rd row: 7/5/2011
4th row: 1/7/2008
5th row: 7/11/2011

Common Values

Value | Count | Frequency (%)
1/10/2011 | 14 | 4.5%
3/30/2015 | 12 | 3.9%
1/5/2015 | 11 | 3.5%
9/29/2014 | 11 | 3.5%
7/5/2011 | 10 | 3.2%
5/16/2011 | 10 | 3.2%
9/30/2013 | 9 | 2.9%
4/2/2012 | 9 | 2.9%
9/26/2011 | 9 | 2.9%
7/7/2014 | 9 | 2.9%
Other values (91) | 207 | 66.6%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
1/10/2011 14
 
4.5%
3/30/2015 12
 
3.9%
1/5/2015 11
 
3.5%
9/29/2014 11
 
3.5%
7/5/2011 10
 
3.2%
5/16/2011 10
 
3.2%
9/26/2011 9
 
2.9%
7/7/2014 9
 
2.9%
7/8/2013 9
 
2.9%
4/2/2012 9
 
2.9%
Other values (91) 207
66.6%

Most occurring characters

ValueCountFrequency (%)
1 630
23.3%
/ 622
23.0%
2 463
17.1%
0 400
14.8%
5 115
 
4.2%
4 102
 
3.8%
3 101
 
3.7%
7 89
 
3.3%
9 74
 
2.7%
6 66
 
2.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 2084
77.0%
Other Punctuation 622
 
23.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 630
30.2%
2 463
22.2%
0 400
19.2%
5 115
 
5.5%
4 102
 
4.9%
3 101
 
4.8%
7 89
 
4.3%
9 74
 
3.6%
6 66
 
3.2%
8 44
 
2.1%
Other Punctuation
ValueCountFrequency (%)
/ 622
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 2706
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 630
23.3%
/ 622
23.0%
2 463
17.1%
0 400
14.8%
5 115
 
4.2%
4 102
 
3.8%
3 101
 
3.7%
7 89
 
3.3%
9 74
 
2.7%
6 66
 
2.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2706
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 630
23.3%
/ 622
23.0%
2 463
17.1%
0 400
14.8%
5 115
 
4.2%
4 102
 
3.8%
3 101
 
3.7%
7 89
 
3.3%
9 74
 
2.7%
6 66
 
2.4%
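Because DateofHire is stored as text, the report treats it as a categorical variable, and plain string sorting does not follow calendar order. The sketch below parses the five most common values with an assumed M/D/YYYY layout and contrasts the two orderings.

```python
from datetime import date, datetime

# The five most frequent hire dates from the table above.
hire_dates = ["1/10/2011", "3/30/2015", "1/5/2015", "9/29/2014", "7/5/2011"]

# Lexicographic order of the raw strings.
as_text = sorted(hire_dates)

# Chronological order after parsing; M/D/YYYY is an assumed layout.
as_dates = sorted(datetime.strptime(s, "%m/%d/%Y").date() for s in hire_dates)

earliest, latest = as_dates[0], as_dates[-1]
print("text order:", as_text)
print("date order:", [d.isoformat() for d in as_dates])
```

With real dates, range and tenure calculations become possible that the character-level statistics above cannot provide.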

DateofTermination
Categorical

HIGH CARDINALITY
HIGH CORRELATION
MISSING
UNIFORM

Distinct: 96
Distinct (%): 92.3%
Missing: 207
Missing (%): 66.6%
Memory size: 13.3 KiB
9/7/2015: 2
5/17/2016: 2
11/4/2015: 2
6/18/2013: 2
4/1/2013: 2
Other values (91): 94

Length

Max length: 10
Median length: 9
Mean length: 8.7788462
Min length: 8

Characters and Unicode

Total characters: 913
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 88
Unique (%): 84.6%

Sample

1st row: 6/16/2016
2nd row: 9/24/2012
3rd row: 9/6/2016
4th row: 1/12/2017
5th row: 9/19/2016

Common Values

Value | Count | Frequency (%)
9/7/2015 | 2 | 0.6%
5/17/2016 | 2 | 0.6%
11/4/2015 | 2 | 0.6%
6/18/2013 | 2 | 0.6%
4/1/2013 | 2 | 0.6%
4/4/2014 | 2 | 0.6%
8/19/2018 | 2 | 0.6%
9/24/2012 | 2 | 0.6%
1/12/2014 | 1 | 0.3%
8/19/2013 | 1 | 0.3%
Other values (86) | 86 | 27.7%
(Missing) | 207 | 66.6%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
9/7/2015 2
 
1.9%
11/4/2015 2
 
1.9%
6/18/2013 2
 
1.9%
4/1/2013 2
 
1.9%
4/4/2014 2
 
1.9%
8/19/2018 2
 
1.9%
9/24/2012 2
 
1.9%
5/17/2016 2
 
1.9%
8/4/2017 1
 
1.0%
8/7/2014 1
 
1.0%
Other values (86) 86
82.7%

Most occurring characters

ValueCountFrequency (%)
/ 208
22.8%
1 187
20.5%
2 157
17.2%
0 114
12.5%
5 54
 
5.9%
6 43
 
4.7%
4 42
 
4.6%
8 32
 
3.5%
9 31
 
3.4%
3 24
 
2.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 705
77.2%
Other Punctuation 208
 
22.8%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 187
26.5%
2 157
22.3%
0 114
16.2%
5 54
 
7.7%
6 43
 
6.1%
4 42
 
6.0%
8 32
 
4.5%
9 31
 
4.4%
3 24
 
3.4%
7 21
 
3.0%
Other Punctuation
ValueCountFrequency (%)
/ 208
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 913
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
/ 208
22.8%
1 187
20.5%
2 157
17.2%
0 114
12.5%
5 54
 
5.9%
6 43
 
4.7%
4 42
 
4.6%
8 32
 
3.5%
9 31
 
3.4%
3 24
 
2.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 913
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
/ 208
22.8%
1 187
20.5%
2 157
17.2%
0 114
12.5%
5 54
 
5.9%
6 43
 
4.7%
4 42
 
4.6%
8 32
 
3.5%
9 31
 
3.4%
3 24
 
2.6%
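The two frequency tables for DateofTermination show different percentages for the same counts (0.6% in one, 1.9% in the other). The arithmetic sketch below shows that the difference comes from the denominator: all 311 rows in the first case, only the 104 non-missing rows in the second. The helper `pct` is a hypothetical name used for illustration.

```python
n_rows = 311      # total observations in the dataset
n_missing = 207   # rows with no termination date
n_present = n_rows - n_missing


def pct(count: int, total: int) -> float:
    """Percentage rounded to one decimal, matching the report's display."""
    return round(100 * count / total, 1)


share_of_all_rows = pct(2, n_rows)     # a date that occurs twice, over all rows
share_of_present = pct(2, n_present)   # the same date, over non-missing rows
distinct_pct = pct(96, n_present)      # Distinct (%) uses non-missing rows
unique_pct = pct(88, n_present)        # Unique (%) uses non-missing rows

print(n_present, share_of_all_rows, share_of_present, distinct_pct, unique_pct)
```

The 207 missing values equal the number of `Active` employees reported under EmploymentStatus, which is consistent with current staff simply having no termination date.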

TermReason
Categorical

Distinct: 18
Distinct (%): 5.8%
Missing: 0
Missing (%): 0.0%
Memory size: 22.2 KiB
N/A-StillEmployed: 207
Another position: 20
unhappy: 14
more money: 11
career change: 9
Other values (13): 50

Length

Max length: 32
Median length: 17
Mean length: 15.546624
Min length: 5

Characters and Unicode

Total characters: 4835
Distinct characters: 30
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3
Unique (%): 1.0%

Sample

1st row: N/A-StillEmployed
2nd row: career change
3rd row: hours
4th row: N/A-StillEmployed
5th row: return to school

Common Values

Value | Count | Frequency (%)
N/A-StillEmployed | 207 | 66.6%
Another position | 20 | 6.4%
unhappy | 14 | 4.5%
more money | 11 | 3.5%
career change | 9 | 2.9%
hours | 8 | 2.6%
attendance | 7 | 2.3%
return to school | 5 | 1.6%
relocation out of area | 5 | 1.6%
no-call, no-show | 4 | 1.3%
Other values (8) | 21 | 6.8%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
n/a-stillemployed 207
51.1%
position 20
 
4.9%
another 20
 
4.9%
unhappy 14
 
3.5%
more 11
 
2.7%
money 11
 
2.7%
career 9
 
2.2%
change 9
 
2.2%
return 8
 
2.0%
hours 8
 
2.0%
Other values (29) 88
21.7%

Most occurring characters

ValueCountFrequency (%)
l 650
 
13.4%
o 354
 
7.3%
e 339
 
7.0%
t 309
 
6.4%
i 283
 
5.9%
p 259
 
5.4%
m 244
 
5.0%
y 239
 
4.9%
A 227
 
4.7%
d 225
 
4.7%
Other values (20) 1706
35.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 3462
71.6%
Uppercase Letter 850
 
17.6%
Dash Punctuation 218
 
4.5%
Other Punctuation 211
 
4.4%
Space Separator 94
 
1.9%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
l 650
18.8%
o 354
10.2%
e 339
9.8%
t 309
8.9%
i 283
8.2%
p 259
 
7.5%
m 244
 
7.0%
y 239
 
6.9%
d 225
 
6.5%
n 127
 
3.7%
Other values (10) 433
12.5%
Uppercase Letter
ValueCountFrequency (%)
A 227
26.7%
N 207
24.4%
S 207
24.4%
E 207
24.4%
L 1
 
0.1%
F 1
 
0.1%
Other Punctuation
ValueCountFrequency (%)
/ 207
98.1%
, 4
 
1.9%
Dash Punctuation
ValueCountFrequency (%)
- 218
100.0%
Space Separator
ValueCountFrequency (%)
94
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 4312
89.2%
Common 523
 
10.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
l 650
15.1%
o 354
 
8.2%
e 339
 
7.9%
t 309
 
7.2%
i 283
 
6.6%
p 259
 
6.0%
m 244
 
5.7%
y 239
 
5.5%
A 227
 
5.3%
d 225
 
5.2%
Other values (16) 1183
27.4%
Common
ValueCountFrequency (%)
- 218
41.7%
/ 207
39.6%
94
18.0%
, 4
 
0.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4835
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
l 650
 
13.4%
o 354
 
7.3%
e 339
 
7.0%
t 309
 
6.4%
i 283
 
5.9%
p 259
 
5.4%
m 244
 
5.0%
y 239
 
4.9%
A 227
 
4.7%
d 225
 
4.7%
Other values (20) 1706
35.3%

EmploymentStatus
Categorical

Distinct: 3
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 20.9 KiB
Active: 207
Voluntarily Terminated: 88
Terminated for Cause: 16

Length

Max length: 22
Median length: 6
Mean length: 11.247588
Min length: 6

Characters and Unicode

Total characters: 3498
Distinct characters: 21
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Active
2nd row: Voluntarily Terminated
3rd row: Voluntarily Terminated
4th row: Active
5th row: Voluntarily Terminated

Common Values

Value | Count | Frequency (%)
Active | 207 | 66.6%
Voluntarily Terminated | 88 | 28.3%
Terminated for Cause | 16 | 5.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
active 207
48.0%
terminated 104
24.1%
voluntarily 88
20.4%
for 16
 
3.7%
cause 16
 
3.7%

Most occurring characters

ValueCountFrequency (%)
e 431
12.3%
t 399
11.4%
i 399
11.4%
r 208
 
5.9%
a 208
 
5.9%
A 207
 
5.9%
c 207
 
5.9%
v 207
 
5.9%
n 192
 
5.5%
l 176
 
5.0%
Other values (11) 864
24.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2963
84.7%
Uppercase Letter 415
 
11.9%
Space Separator 120
 
3.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 431
14.5%
t 399
13.5%
i 399
13.5%
r 208
7.0%
a 208
7.0%
c 207
7.0%
v 207
7.0%
n 192
6.5%
l 176
5.9%
u 104
 
3.5%
Other values (6) 432
14.6%
Uppercase Letter
ValueCountFrequency (%)
A 207
49.9%
T 104
25.1%
V 88
21.2%
C 16
 
3.9%
Space Separator
ValueCountFrequency (%)
120
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3378
96.6%
Common 120
 
3.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 431
12.8%
t 399
11.8%
i 399
11.8%
r 208
 
6.2%
a 208
 
6.2%
A 207
 
6.1%
c 207
 
6.1%
v 207
 
6.1%
n 192
 
5.7%
l 176
 
5.2%
Other values (10) 744
22.0%
Common
ValueCountFrequency (%)
120
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3498
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 431
12.3%
t 399
11.4%
i 399
11.4%
r 208
 
5.9%
a 208
 
5.9%
A 207
 
5.9%
c 207
 
5.9%
v 207
 
5.9%
n 192
 
5.5%
l 176
 
5.0%
Other values (11) 864
24.7%

Department
Categorical

Distinct: 6
Distinct (%): 1.9%
Missing: 0
Missing (%): 0.0%
Memory size: 21.6 KiB
Production: 209
IT/IS: 50
Sales: 31
Software Engineering: 11
Admin Offices: 9

Length

Max length: 20
Median length: 17
Mean length: 13.861736
Min length: 5

Characters and Unicode

Total characters: 4311
Distinct characters: 27
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 0.3%

Sample

1st row: Production
2nd row: IT/IS
3rd row: Production
4th row: Production
5th row: Production

Common Values

Value | Count | Frequency (%)
Production | 209 | 67.2%
IT/IS | 50 | 16.1%
Sales | 31 | 10.0%
Software Engineering | 11 | 3.5%
Admin Offices | 9 | 2.9%
Executive Office | 1 | 0.3%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
production 209
63.0%
it/is 50
 
15.1%
sales 31
 
9.3%
software 11
 
3.3%
engineering 11
 
3.3%
admin 9
 
2.7%
offices 9
 
2.7%
executive 1
 
0.3%
office 1
 
0.3%

Most occurring characters

ValueCountFrequency (%)
1484
34.4%
o 429
 
10.0%
i 251
 
5.8%
n 251
 
5.8%
r 231
 
5.4%
t 221
 
5.1%
c 220
 
5.1%
d 218
 
5.1%
u 210
 
4.9%
P 209
 
4.8%
Other values (17) 587
 
13.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2295
53.2%
Space Separator 1484
34.4%
Uppercase Letter 482
 
11.2%
Other Punctuation 50
 
1.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 429
18.7%
i 251
10.9%
n 251
10.9%
r 231
10.1%
t 221
9.6%
c 220
9.6%
d 218
9.5%
u 210
9.2%
e 76
 
3.3%
a 42
 
1.8%
Other values (8) 146
 
6.4%
Uppercase Letter
ValueCountFrequency (%)
P 209
43.4%
I 100
20.7%
S 92
19.1%
T 50
 
10.4%
E 12
 
2.5%
O 10
 
2.1%
A 9
 
1.9%
Space Separator
ValueCountFrequency (%)
1484
100.0%
Other Punctuation
ValueCountFrequency (%)
/ 50
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2777
64.4%
Common 1534
35.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 429
15.4%
i 251
9.0%
n 251
9.0%
r 231
8.3%
t 221
8.0%
c 220
7.9%
d 218
7.9%
u 210
7.6%
P 209
7.5%
I 100
 
3.6%
Other values (15) 437
15.7%
Common
ValueCountFrequency (%)
1484
96.7%
/ 50
 
3.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4311
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1484
34.4%
o 429
 
10.0%
i 251
 
5.8%
n 251
 
5.8%
r 231
 
5.4%
t 221
 
5.1%
c 220
 
5.1%
d 218
 
5.1%
u 210
 
4.9%
P 209
 
4.8%
Other values (17) 587
 
13.6%

ManagerName
Categorical

Distinct: 21
Distinct (%): 6.8%
Missing: 0
Missing (%): 0.0%
Memory size: 21.3 KiB
Michael Albert: 22
Kissy Sullivan: 22
Elijiah Gray: 22
Kelley Spirea: 22
Brannon Miller: 22
Other values (16): 201

Length

Max length: 18
Median length: 16
Mean length: 12.665595
Min length: 8

Characters and Unicode

Total characters: 3939
Distinct characters: 41
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Michael Albert
2nd row: Simon Roup
3rd row: Kissy Sullivan
4th row: Elijiah Gray
5th row: Webster Butler

Common Values

Value | Count | Frequency (%)
Michael Albert | 22 | 7.1%
Kissy Sullivan | 22 | 7.1%
Elijiah Gray | 22 | 7.1%
Kelley Spirea | 22 | 7.1%
Brannon Miller | 22 | 7.1%
Ketsia Liebig | 21 | 6.8%
David Stanley | 21 | 6.8%
Amy Dunn | 21 | 6.8%
Webster Butler | 21 | 6.8%
Janet King | 19 | 6.1%
Other values (11) | 98 | 31.5%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
michael 22
 
3.5%
kissy 22
 
3.5%
sullivan 22
 
3.5%
elijiah 22
 
3.5%
gray 22
 
3.5%
kelley 22
 
3.5%
spirea 22
 
3.5%
brannon 22
 
3.5%
miller 22
 
3.5%
albert 22
 
3.5%
Other values (34) 411
65.1%

Most occurring characters

ValueCountFrequency (%)
e 402
 
10.2%
n 327
 
8.3%
320
 
8.1%
i 320
 
8.1%
a 313
 
7.9%
l 280
 
7.1%
r 231
 
5.9%
t 186
 
4.7%
o 125
 
3.2%
y 121
 
3.1%
Other values (31) 1314
33.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2976
75.6%
Uppercase Letter 636
 
16.1%
Space Separator 320
 
8.1%
Other Punctuation 7
 
0.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 402
13.5%
n 327
11.0%
i 320
10.8%
a 313
10.5%
l 280
9.4%
r 231
7.8%
t 186
 
6.2%
o 125
 
4.2%
y 121
 
4.1%
u 101
 
3.4%
Other values (13) 570
19.2%
Uppercase Letter
ValueCountFrequency (%)
S 105
16.5%
K 84
13.2%
B 67
10.5%
D 64
10.1%
M 58
9.1%
A 52
8.2%
L 41
 
6.4%
J 40
 
6.3%
E 26
 
4.1%
R 24
 
3.8%
Other values (6) 75
11.8%
Space Separator
ValueCountFrequency (%)
320
100.0%
Other Punctuation
ValueCountFrequency (%)
. 7
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3612
91.7%
Common 327
 
8.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 402
 
11.1%
n 327
 
9.1%
i 320
 
8.9%
a 313
 
8.7%
l 280
 
7.8%
r 231
 
6.4%
t 186
 
5.1%
o 125
 
3.5%
y 121
 
3.3%
S 105
 
2.9%
Other values (29) 1202
33.3%
Common
ValueCountFrequency (%)
320
97.9%
. 7
 
2.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3939
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 402
 
10.2%
n 327
 
8.3%
320
 
8.1%
i 320
 
8.1%
a 313
 
7.9%
l 280
 
7.1%
r 231
 
5.9%
t 186
 
4.7%
o 125
 
3.2%
y 121
 
3.1%
Other values (31) 1314
33.4%

ManagerID
Real number (ℝ)

HIGH CORRELATION
MISSING

Distinct: 23
Distinct (%): 7.6%
Missing: 8
Missing (%): 2.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 14.570957
Minimum: 1
Maximum: 39
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.6 KiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 10
median: 15
Q3: 19
95-th percentile: 22
Maximum: 39
Range: 38
Interquartile range (IQR): 9

Descriptive statistics

Standard deviation: 8.0783056
Coefficient of variation (CV): 0.55441146
Kurtosis: 1.6084223
Mean: 14.570957
Median Absolute Deviation (MAD): 4
Skewness: 0.75927123
Sum: 4415
Variance: 65.259021
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=23)
ValueCountFrequency (%)
20 22
 
7.1%
16 22
 
7.1%
12 22
 
7.1%
18 22
 
7.1%
22 21
 
6.8%
11 21
 
6.8%
19 21
 
6.8%
14 21
 
6.8%
2 19
 
6.1%
4 17
 
5.5%
Other values (13) 95
30.5%
ValueCountFrequency (%)
1 6
 
1.9%
2 19
6.1%
3 1
 
0.3%
4 17
5.5%
5 7
 
2.3%
6 4
 
1.3%
7 14
4.5%
9 2
 
0.6%
10 9
2.9%
11 21
6.8%
ValueCountFrequency (%)
39 13
4.2%
30 1
 
0.3%
22 21
6.8%
21 13
4.2%
20 22
7.1%
19 21
6.8%
18 22
7.1%
17 14
4.5%
16 22
7.1%
15 3
 
1.0%
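The ManagerID summary figures are related by simple identities, which makes them easy to sanity-check. The sketch below recomputes the mean, coefficient of variation, and variance from the reported sum, standard deviation, and missing count. It assumes that missing values are excluded from the mean, which the matching result supports.

```python
import math

n_rows, n_missing = 311, 8   # observations and missing ManagerID values
total = 4415                 # reported Sum
std = 8.0783056              # reported Standard deviation

# Mean over the non-missing rows only.
mean = total / (n_rows - n_missing)

# Coefficient of variation is the standard deviation divided by the mean.
cv = std / mean

# Variance is the square of the standard deviation.
variance = std ** 2

print(round(mean, 6), round(cv, 6), round(variance, 4))
```

Since ManagerID is an identifier, these moments mainly confirm internal consistency and carry little meaning of their own.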
Distinct: 9
Distinct (%): 2.9%
Missing: 0
Missing (%): 0.0%
Memory size: 20.6 KiB
Indeed: 87
LinkedIn: 76
Google Search: 49
Employee Referral: 31
Diversity Job Fair: 29
Other values (4): 39

Length

Max length: 23
Median length: 18
Mean length: 10.414791
Min length: 5

Characters and Unicode

Total characters: 3239
Distinct characters: 36
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 0.3%

Sample

1st row: LinkedIn
2nd row: Indeed
3rd row: LinkedIn
4th row: Indeed
5th row: Google Search

Common Values

Value | Count | Frequency (%)
Indeed | 87 | 28.0%
LinkedIn | 76 | 24.4%
Google Search | 49 | 15.8%
Employee Referral | 31 | 10.0%
Diversity Job Fair | 29 | 9.3%
CareerBuilder | 23 | 7.4%
Website | 13 | 4.2%
Other | 2 | 0.6%
On-line Web application | 1 | 0.3%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
indeed 87
19.3%
linkedin 76
16.9%
google 49
10.9%
search 49
10.9%
employee 31
 
6.9%
referral 31
 
6.9%
diversity 29
 
6.4%
job 29
 
6.4%
fair 29
 
6.4%
careerbuilder 23
 
5.1%
Other values (5) 18
 
4.0%

Most occurring characters

ValueCountFrequency (%)
e 600
18.5%
d 273
 
8.4%
n 242
 
7.5%
r 240
 
7.4%
i 202
 
6.2%
I 163
 
5.0%
o 159
 
4.9%
140
 
4.3%
l 136
 
4.2%
a 134
 
4.1%
Other values (26) 950
29.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2549
78.7%
Uppercase Letter 549
 
16.9%
Space Separator 140
 
4.3%
Dash Punctuation 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 600
23.5%
d 273
10.7%
n 242
9.5%
r 240
 
9.4%
i 202
 
7.9%
o 159
 
6.2%
l 136
 
5.3%
a 134
 
5.3%
k 76
 
3.0%
y 60
 
2.4%
Other values (11) 427
16.8%
Uppercase Letter
ValueCountFrequency (%)
I 163
29.7%
L 76
13.8%
G 49
 
8.9%
S 49
 
8.9%
R 31
 
5.6%
E 31
 
5.6%
D 29
 
5.3%
J 29
 
5.3%
F 29
 
5.3%
C 23
 
4.2%
Other values (3) 40
 
7.3%
Space Separator
ValueCountFrequency (%)
140
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3098
95.6%
Common 141
 
4.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 600
19.4%
d 273
 
8.8%
n 242
 
7.8%
r 240
 
7.7%
i 202
 
6.5%
I 163
 
5.3%
o 159
 
5.1%
l 136
 
4.4%
a 134
 
4.3%
k 76
 
2.5%
Other values (24) 873
28.2%
Common
ValueCountFrequency (%)
140
99.3%
- 1
 
0.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3239
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 600
18.5%
d 273
 
8.4%
n 242
 
7.5%
r 240
 
7.4%
i 202
 
6.2%
I 163
 
5.0%
o 159
 
4.9%
140
 
4.3%
l 136
 
4.2%
a 134
 
4.1%
Other values (26) 950
29.3%

PerformanceScore
Categorical

Distinct: 4
Distinct (%): 1.3%
Missing: 0
Missing (%): 0.0%
Memory size: 20.6 KiB
Fully Meets: 243
Exceeds: 37
Needs Improvement: 18
PIP: 13

Length

Max length: 17
Median length: 11
Mean length: 10.536977
Min length: 3

Characters and Unicode

Total characters: 3277
Distinct characters: 22
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Exceeds
2nd row: Fully Meets
3rd row: Fully Meets
4th row: Fully Meets
5th row: Fully Meets

Common Values

Value | Count | Frequency (%)
Fully Meets | 243 | 78.1%
Exceeds | 37 | 11.9%
Needs Improvement | 18 | 5.8%
PIP | 13 | 4.2%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
fully 243
42.5%
meets 243
42.5%
exceeds 37
 
6.5%
needs 18
 
3.1%
improvement 18
 
3.1%
pip 13
 
2.3%

Most occurring characters

ValueCountFrequency (%)
e 632
19.3%
l 486
14.8%
s 298
9.1%
261
8.0%
t 261
8.0%
F 243
 
7.4%
u 243
 
7.4%
y 243
 
7.4%
M 243
 
7.4%
d 55
 
1.7%
Other values (12) 312
9.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2418
73.8%
Uppercase Letter 598
 
18.2%
Space Separator 261
 
8.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 632
26.1%
l 486
20.1%
s 298
12.3%
t 261
10.8%
u 243
 
10.0%
y 243
 
10.0%
d 55
 
2.3%
c 37
 
1.5%
x 37
 
1.5%
m 36
 
1.5%
Other values (5) 90
 
3.7%
Uppercase Letter
ValueCountFrequency (%)
F 243
40.6%
M 243
40.6%
E 37
 
6.2%
I 31
 
5.2%
P 26
 
4.3%
N 18
 
3.0%
Space Separator
ValueCountFrequency (%)
261
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3016
92.0%
Common 261
 
8.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 632
21.0%
l 486
16.1%
s 298
9.9%
t 261
8.7%
F 243
 
8.1%
u 243
 
8.1%
y 243
 
8.1%
M 243
 
8.1%
d 55
 
1.8%
c 37
 
1.2%
Other values (11) 275
9.1%
Common
ValueCountFrequency (%)
261
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3277
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 632
19.3%
l 486
14.8%
s 298
9.1%
261
8.0%
t 261
8.0%
F 243
 
7.4%
u 243
 
7.4%
y 243
 
7.4%
M 243
 
7.4%
d 55
 
1.7%
Other values (12) 312
9.5%

EngagementSurvey
Real number (ℝ)

Distinct: 119
Distinct (%): 38.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.11
Minimum: 1.12
Maximum: 5
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.6 KiB

Quantile statistics

Minimum: 1.12
5-th percentile: 2.4
Q1: 3.69
median: 4.28
Q3: 4.7
95-th percentile: 5
Maximum: 5
Range: 3.88
Interquartile range (IQR): 1.01

Descriptive statistics

Standard deviation: 0.78993752
Coefficient of variation (CV): 0.19219891
Kurtosis: 1.1645598
Mean: 4.11
Median Absolute Deviation (MAD): 0.49
Skewness: -1.1169793
Sum: 1278.21
Variance: 0.62400129
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
5 56
 
18.0%
4.5 19
 
6.1%
4.3 17
 
5.5%
4.2 17
 
5.5%
4.1 16
 
5.1%
4.6 10
 
3.2%
4.7 7
 
2.3%
4.4 7
 
2.3%
3.6 7
 
2.3%
3.4 5
 
1.6%
Other values (109) 150
48.2%
ValueCountFrequency (%)
1.12 1
 
0.3%
1.2 1
 
0.3%
1.56 1
 
0.3%
1.81 1
 
0.3%
1.93 1
 
0.3%
2 3
1.0%
2.1 1
 
0.3%
2.3 2
0.6%
2.33 1
 
0.3%
2.34 1
 
0.3%
ValueCountFrequency (%)
5 56
18.0%
4.96 2
 
0.6%
4.94 1
 
0.3%
4.9 1
 
0.3%
4.88 1
 
0.3%
4.84 1
 
0.3%
4.83 2
 
0.6%
4.81 1
 
0.3%
4.8 3
 
1.0%
4.78 1
 
0.3%

EmpSatisfaction
Categorical

Distinct: 5
Distinct (%): 1.6%
Missing: 0
Missing (%): 0.0%
Memory size: 17.7 KiB

Value | Count
3 | 108
5 | 98
4 | 94
2 | 9
1 | 2

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 311
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 5
2nd row: 3
3rd row: 3
4th row: 5
5th row: 4

Common Values

Value | Count | Frequency (%)
3 | 108 | 34.7%
5 | 98 | 31.5%
4 | 94 | 30.2%
2 | 9 | 2.9%
1 | 2 | 0.6%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
3 | 108 | 34.7%
5 | 98 | 31.5%
4 | 94 | 30.2%
2 | 9 | 2.9%
1 | 2 | 0.6%

Most occurring characters

Value | Count | Frequency (%)
3 | 108 | 34.7%
5 | 98 | 31.5%
4 | 94 | 30.2%
2 | 9 | 2.9%
1 | 2 | 0.6%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 311 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
3 | 108 | 34.7%
5 | 98 | 31.5%
4 | 94 | 30.2%
2 | 9 | 2.9%
1 | 2 | 0.6%

Most occurring scripts

Value | Count | Frequency (%)
Common | 311 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
3 | 108 | 34.7%
5 | 98 | 31.5%
4 | 94 | 30.2%
2 | 9 | 2.9%
1 | 2 | 0.6%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 311 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
3 | 108 | 34.7%
5 | 98 | 31.5%
4 | 94 | 30.2%
2 | 9 | 2.9%
1 | 2 | 0.6%

SpecialProjectsCount
Real number (ℝ)

HIGH CORRELATION
ZEROS

Distinct: 9
Distinct (%): 2.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.2186495
Minimum: 0
Maximum: 8
Zeros: 241
Zeros (%): 77.5%
Negative: 0
Negative (%): 0.0%
Memory size: 2.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 6
Maximum: 8
Range: 8
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 2.3494212
Coefficient of variation (CV): 1.9278892
Kurtosis: 0.64141536
Mean: 1.2186495
Median Absolute Deviation (MAD): 0
Skewness: 1.5392709
Sum: 379
Variance: 5.5197801
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=9)

Value | Count | Frequency (%)
0 | 241 | 77.5%
6 | 21 | 6.8%
5 | 21 | 6.8%
7 | 12 | 3.9%
4 | 9 | 2.9%
3 | 3 | 1.0%
8 | 2 | 0.6%
2 | 1 | 0.3%
1 | 1 | 0.3%

Value | Count | Frequency (%)
0 | 241 | 77.5%
1 | 1 | 0.3%
2 | 1 | 0.3%
3 | 3 | 1.0%
4 | 9 | 2.9%
5 | 21 | 6.8%
6 | 21 | 6.8%
7 | 12 | 3.9%
8 | 2 | 0.6%

Value | Count | Frequency (%)
8 | 2 | 0.6%
7 | 12 | 3.9%
6 | 21 | 6.8%
5 | 21 | 6.8%
4 | 9 | 2.9%
3 | 3 | 1.0%
2 | 1 | 0.3%
1 | 1 | 0.3%
0 | 241 | 77.5%
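
Because SpecialProjectsCount has only nine distinct values, its summary statistics can be rebuilt from the frequency table alone. The sketch below does that for the sum, mean, variance and share of zeros. It assumes the report's variance is the sample variance (divisor n - 1); that assumption is not stated in the report, but it reproduces the printed figure.

```python
# Frequency table for SpecialProjectsCount, copied from the report: value -> count.
counts = {0: 241, 1: 1, 2: 1, 3: 3, 4: 9, 5: 21, 6: 21, 7: 12, 8: 2}

n = sum(counts.values())                          # number of observations
total = sum(v * c for v, c in counts.items())     # Sum
mean = total / n                                  # Mean

# Assumed estimator: sample variance, dividing by n - 1.
variance = sum(c * (v - mean) ** 2 for v, c in counts.items()) / (n - 1)

zeros_pct = 100 * counts[0] / n                   # Zeros (%)

print(n, total, round(mean, 7), round(variance, 7), round(zeros_pct, 1))
```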

LastPerformanceReview_Date
Categorical

HIGH CARDINALITY

Distinct: 137
Distinct (%): 44.1%
Missing: 0
Missing (%): 0.0%
Memory size: 20.1 KiB

Value | Count
1/14/2019 | 18
2/18/2019 | 12
1/21/2019 | 10
1/28/2019 | 9
2/25/2019 | 9
Other values (132) | 253

Length

Max length: 10
Median length: 9
Mean length: 8.6848875
Min length: 8

Characters and Unicode

Total characters: 2701
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
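
As a rough illustration of how such character properties can be read, the sketch below uses Python's standard unicodedata module to count general categories in the five sample values listed for this variable. The two-letter codes it returns correspond to the names used in this report: Nd is "Decimal Number" and Po is "Other Punctuation". This is not necessarily how the report itself computes the counts.

```python
import unicodedata
from collections import Counter

# The five sample rows shown for this variable.
samples = ["1/17/2019", "2/24/2016", "5/15/2012", "1/3/2019", "2/1/2016"]

# Count the Unicode general category of every character in every value.
category_counts = Counter(
    unicodedata.category(ch) for value in samples for ch in value
)

print(category_counts)
```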

Unique

Unique: 89
Unique (%): 28.6%

Sample

1st row: 1/17/2019
2nd row: 2/24/2016
3rd row: 5/15/2012
4th row: 1/3/2019
5th row: 2/1/2016

Common Values

Value | Count | Frequency (%)
1/14/2019 | 18 | 5.8%
2/18/2019 | 12 | 3.9%
1/21/2019 | 10 | 3.2%
1/28/2019 | 9 | 2.9%
2/25/2019 | 9 | 2.9%
1/17/2019 | 8 | 2.6%
1/7/2019 | 7 | 2.3%
1/30/2019 | 6 | 1.9%
2/11/2019 | 6 | 1.9%
2/22/2019 | 6 | 1.9%
Other values (127) | 220 | 70.7%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
1/14/2019 | 18 | 5.8%
2/18/2019 | 12 | 3.9%
1/21/2019 | 10 | 3.2%
1/28/2019 | 9 | 2.9%
2/25/2019 | 9 | 2.9%
1/17/2019 | 8 | 2.6%
1/7/2019 | 7 | 2.3%
2/22/2019 | 6 | 1.9%
1/25/2019 | 6 | 1.9%
1/31/2019 | 6 | 1.9%
Other values (127) | 220 | 70.7%

Most occurring characters

Value | Count | Frequency (%)
/ | 622 | 23.0%
1 | 615 | 22.8%
2 | 554 | 20.5%
0 | 342 | 12.7%
9 | 224 | 8.3%
5 | 76 | 2.8%
4 | 75 | 2.8%
3 | 65 | 2.4%
8 | 45 | 1.7%
7 | 43 | 1.6%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 2079 | 77.0%
Other Punctuation | 622 | 23.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
1 | 615 | 29.6%
2 | 554 | 26.6%
0 | 342 | 16.5%
9 | 224 | 10.8%
5 | 76 | 3.7%
4 | 75 | 3.6%
3 | 65 | 3.1%
8 | 45 | 2.2%
7 | 43 | 2.1%
6 | 40 | 1.9%
Other Punctuation
Value | Count | Frequency (%)
/ | 622 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 2701 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
/ | 622 | 23.0%
1 | 615 | 22.8%
2 | 554 | 20.5%
0 | 342 | 12.7%
9 | 224 | 8.3%
5 | 76 | 2.8%
4 | 75 | 2.8%
3 | 65 | 2.4%
8 | 45 | 1.7%
7 | 43 | 1.6%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 2701 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
/ | 622 | 23.0%
1 | 615 | 22.8%
2 | 554 | 20.5%
0 | 342 | 12.7%
9 | 224 | 8.3%
5 | 76 | 2.8%
4 | 75 | 2.8%
3 | 65 | 2.4%
8 | 45 | 1.7%
7 | 43 | 1.6%

DaysLateLast30
Real number (ℝ)

HIGH CORRELATION
ZEROS

Distinct: 7
Distinct (%): 2.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.414791
Minimum: 0
Maximum: 6
Zeros: 278
Zeros (%): 89.4%
Negative: 0
Negative (%): 0.0%
Memory size: 2.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 4
Maximum: 6
Range: 6
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 1.2945194
Coefficient of variation (CV): 3.1208956
Kurtosis: 8.8305232
Mean: 0.414791
Median Absolute Deviation (MAD): 0
Skewness: 3.1434676
Sum: 129
Variance: 1.6757805
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=7)

Value | Count | Frequency (%)
0 | 278 | 89.4%
4 | 8 | 2.6%
2 | 6 | 1.9%
5 | 6 | 1.9%
3 | 6 | 1.9%
6 | 6 | 1.9%
1 | 1 | 0.3%

Value | Count | Frequency (%)
0 | 278 | 89.4%
1 | 1 | 0.3%
2 | 6 | 1.9%
3 | 6 | 1.9%
4 | 8 | 2.6%
5 | 6 | 1.9%
6 | 6 | 1.9%

Value | Count | Frequency (%)
6 | 6 | 1.9%
5 | 6 | 1.9%
4 | 8 | 2.6%
3 | 6 | 1.9%
2 | 6 | 1.9%
1 | 1 | 0.3%
0 | 278 | 89.4%

Absences
Real number (ℝ)

Distinct: 20
Distinct (%): 6.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10.237942
Minimum: 1
Maximum: 20
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.6 KiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 5
Median: 10
Q3: 15
95-th percentile: 19
Maximum: 20
Range: 19
Interquartile range (IQR): 10

Descriptive statistics

Standard deviation: 5.8525959
Coefficient of variation (CV): 0.57165745
Kurtosis: -1.301962
Mean: 10.237942
Median Absolute Deviation (MAD): 5
Skewness: 0.029283457
Sum: 3184
Variance: 34.252878
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=20)

Value | Count | Frequency (%)
4 | 23 | 7.4%
16 | 23 | 7.4%
7 | 21 | 6.8%
2 | 21 | 6.8%
15 | 20 | 6.4%
13 | 17 | 5.5%
14 | 17 | 5.5%
3 | 16 | 5.1%
19 | 16 | 5.1%
6 | 16 | 5.1%
Other values (10) | 121 | 38.9%

Value | Count | Frequency (%)
1 | 14 | 4.5%
2 | 21 | 6.8%
3 | 16 | 5.1%
4 | 23 | 7.4%
5 | 12 | 3.9%
6 | 16 | 5.1%
7 | 21 | 6.8%
8 | 11 | 3.5%
9 | 14 | 4.5%
10 | 10 | 3.2%

Value | Count | Frequency (%)
20 | 14 | 4.5%
19 | 16 | 5.1%
18 | 8 | 2.6%
17 | 15 | 4.8%
16 | 23 | 7.4%
15 | 20 | 6.4%
14 | 17 | 5.5%
13 | 17 | 5.5%
12 | 8 | 2.6%
11 | 15 | 4.8%
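
The two extreme-value tables for Absences together list all 20 distinct values, so the quantile statistics can be checked from the counts. The sketch below expands the counts into a sorted list and reads quantiles with linear interpolation between neighbouring ranks. That interpolation rule is an assumption about how the report computes quantiles, and the helper name `quantile` is illustrative.

```python
# Absences frequency table assembled from the two extreme-value tables: value -> count.
counts = {
    1: 14, 2: 21, 3: 16, 4: 23, 5: 12, 6: 16, 7: 21, 8: 11, 9: 14, 10: 10,
    11: 15, 12: 8, 13: 17, 14: 17, 15: 20, 16: 23, 17: 15, 18: 8, 19: 16, 20: 14,
}

# Expand the table into a sorted list of individual observations.
values = [v for v in sorted(counts) for _ in range(counts[v])]


def quantile(sorted_values, q):
    """Quantile by linear interpolation between the two nearest ranks (assumed rule)."""
    pos = q * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] + frac * (sorted_values[hi] - sorted_values[lo])


for label, q in [("5%", 0.05), ("Q1", 0.25), ("median", 0.5), ("Q3", 0.75), ("95%", 0.95)]:
    print(label, quantile(values, q))
```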

Interactions

[100 interaction plots; the images are not preserved in this text extract]

Correlations


Auto

The auto setting selects an interpretable association measure for each pair of columns, according to the following mapping:
  • Variable_type-Variable_type : Method, Range
  • Categorical-Categorical : Cramér's V, [0,1]
  • Numerical-Categorical : Cramér's V, [0,1] (using a discretized numerical column)
  • Numerical-Numerical : Spearman's ρ, [-1,1]
The number of bins used to discretize the numerical column in a Numerical-Categorical pair can be changed with config.correlations["auto"].n_bins. The number of bins determines the granularity of the association being measured.

This configuration uses the recommended metric for each pair of columns.
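
To make the mapping concrete, here is a minimal sketch of the two steps it implies: choosing a measure from the pair of column types, and discretizing a numerical column before it is treated as categorical. The function names `auto_method` and `discretize` are hypothetical, and equal-width binning is an assumption; the report does not say which binning strategy it uses.

```python
def auto_method(type_a, type_b):
    """Pick the association measure for a pair of column types, per the mapping above."""
    if type_a == "Numerical" and type_b == "Numerical":
        return "Spearman's rho"
    # Categorical-Categorical and Numerical-Categorical both use Cramér's V.
    return "Cramér's V"


def discretize(values, n_bins):
    """Map each number to an equal-width bin index in [0, n_bins - 1] (assumed strategy)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0 for _ in values]
    width = (hi - lo) / n_bins
    # The maximum falls on the upper edge, so clamp it into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


print(auto_method("Numerical", "Categorical"))
print(discretize([1, 5, 10, 15, 20], 4))
```

Fewer bins merge nearby values and give a coarser view of the association; more bins give a finer view but leave fewer observations per bin.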

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better than Pearson's r at catching nonlinear monotonic correlations. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and 1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
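
The calculation just described can be sketched in plain Python: rank both variables, then apply the covariance-over-standard-deviations formula to the ranks. Giving tied values the average of their positions is an assumption here (it is the usual convention), and the function names are illustrative rather than the report's own implementation.

```python
import math


def average_ranks(values):
    """Rank values from 1, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Covariance divided by the product of standard deviations (1/n factors cancel)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rho: Pearson's r computed on the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))


x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]          # monotonic but not linear
print(spearman(x, y), pearson(x, y))
```

For this monotonic but nonlinear relationship, ρ reaches 1 while Pearson's r stays below 1.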

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and 1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
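
A short sketch of that formula, together with a check of the invariance property mentioned above. The two input lists are the EngagementSurvey and Absences values of the first five sample rows of this dataset, used only as convenient inputs; no claim is made about the resulting coefficient.

```python
import math


def pearson(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    sx = math.sqrt(sum((a - mx) ** 2 for a in x) / n)
    sy = math.sqrt(sum((b - my) ** 2 for b in y) / n)
    return cov / (sx * sy)


# EngagementSurvey and Absences for the first five sample rows.
x = [4.60, 4.96, 3.02, 4.84, 5.00]
y = [1, 17, 3, 15, 2]

r = pearson(x, y)
# Shift and rescale each variable separately: r should not change.
r_shifted = pearson([2 * a + 3 for a in x], [0.5 * b - 7 for b in y])

# Two exact linear functions with different slopes both give r = 1.
r_shallow = pearson(x, [2 * a for a in x])
r_steep = pearson(x, [10 * a + 1 for a in x])

print(r, r_shifted, r_shallow, r_steep)
```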

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and 1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
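
The pair-counting definition can be written directly, as in the sketch below. It implements the simplest variant, often called tau-a, in which tied pairs count as neither concordant nor discordant. Statistical libraries commonly apply a tie correction instead, so results may differ from the report on data with many ties. The function name is illustrative.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total pairs; ties contribute to neither count."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1      # both variables move in the same direction
        elif sign < 0:
            discordant += 1      # the variables move in opposite directions
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


print(kendall_tau_a([1, 2, 3], [1, 3, 2]))
```

In the example, two of the three pairs are concordant and one is discordant, giving τ = (2 - 1) / 3.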

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. This report uses the bias-corrected measure proposed by Bergsma in 2013.
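
The sketch below shows one common way to write the bias-corrected coefficient: compute the chi-squared statistic of the contingency table, subtract the expected bias from φ², and shrink the table dimensions accordingly. It follows the correction as it is usually stated for Bergsma's proposal; the report's own implementation may differ in detail, and the function name is illustrative. The example inputs are the GenderID and Sex values of the first ten sample rows.

```python
import math
from collections import Counter


def cramers_v_corrected(a, b):
    """Bias-corrected Cramér's V for two equally long lists of category labels."""
    n = len(a)
    joint = Counter(zip(a, b))
    row, col = Counter(a), Counter(b)

    # Pearson chi-squared statistic over every cell of the contingency table.
    chi2 = 0.0
    for ra, rc in row.items():
        for cb, cc in col.items():
            expected = rc * cc / n
            observed = joint.get((ra, cb), 0)
            chi2 += (observed - expected) ** 2 / expected

    phi2 = chi2 / n
    r, k = len(row), len(col)
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(r_corr - 1, k_corr - 1)
    if denom <= 0:
        return 0.0               # a single category carries no association
    return math.sqrt(phi2_corr / denom)


# GenderID and Sex for the first ten sample rows.
gender_id = [1, 1, 0, 0, 0, 0, 0, 1, 0, 1]
sex = ["M", "M", "F", "F", "F", "F", "F", "M", "F", "M"]
v_perfect = cramers_v_corrected(gender_id, sex)

# A balanced 2x2 table with no association.
v_none = cramers_v_corrected([0, 0, 1, 1], ["x", "y", "x", "y"])

print(v_perfect, v_none)
```

GenderID determines Sex exactly in these rows, so the coefficient is 1, which is consistent with the high-correlation alert raised for this pair.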

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently across categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient for a bivariate normal input distribution.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly spot patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
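
One way to read "nullity correlation" is as the ordinary correlation between missingness indicators: mark each cell 1 if it is missing and 0 otherwise, then correlate the indicator columns. The sketch below follows that reading with hypothetical toy columns that are not taken from this dataset; whether the report's heatmap is computed exactly this way is an assumption.

```python
import math


def nullity(column):
    """Indicator list: 1 where the value is missing (None), else 0."""
    return [1 if v is None else 0 for v in column]


def pearson(x, y):
    """Covariance divided by the product of standard deviations (1/n factors cancel)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Hypothetical toy columns, for illustration only.
col_a = [1, None, 3, None, 5, 6]
col_b = ["x", None, "y", None, "z", "w"]   # missing in the same rows as col_a
col_c = [None, 1, None, 2, None, None]     # missing exactly where col_a is present

same = pearson(nullity(col_a), nullity(col_b))
opposite = pearson(nullity(col_a), nullity(col_c))

print(same, opposite)
```

A value near 1 means the two columns tend to be missing together; a value near -1 means one is present when the other is missing.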

Sample

First rows

(index) | Employee_Name | EmpID | MarriedID | MaritalStatusID | GenderID | EmpStatusID | DeptID | PerfScoreID | FromDiversityJobFairID | Salary | Termd | PositionID | Position | State | Zip | DOB | Sex | MaritalDesc | CitizenDesc | HispanicLatino | RaceDesc | DateofHire | DateofTermination | TermReason | EmploymentStatus | Department | ManagerName | ManagerID | RecruitmentSource | PerformanceScore | EngagementSurvey | EmpSatisfaction | SpecialProjectsCount | LastPerformanceReview_Date | DaysLateLast30 | Absences
0 | Adinolfi, Wilson K | 10026 | 0 | 0 | 1 | 1 | 5 | 4 | 0 | 62506 | 0 | 19 | Production Technician I | MA | 1960 | 07/10/83 | M | Single | US Citizen | No | White | 7/5/2011 | NaN | N/A-StillEmployed | Active | Production | Michael Albert | 22.0 | LinkedIn | Exceeds | 4.60 | 5 | 0 | 1/17/2019 | 0 | 1
1 | Ait Sidi, Karthikeyan | 10084 | 1 | 1 | 1 | 5 | 3 | 3 | 0 | 104437 | 1 | 27 | Sr. DBA | MA | 2148 | 05/05/75 | M | Married | US Citizen | No | White | 3/30/2015 | 6/16/2016 | career change | Voluntarily Terminated | IT/IS | Simon Roup | 4.0 | Indeed | Fully Meets | 4.96 | 3 | 6 | 2/24/2016 | 0 | 17
2 | Akinkuolie, Sarah | 10196 | 1 | 1 | 0 | 5 | 5 | 3 | 0 | 64955 | 1 | 20 | Production Technician II | MA | 1810 | 09/19/88 | F | Married | US Citizen | No | White | 7/5/2011 | 9/24/2012 | hours | Voluntarily Terminated | Production | Kissy Sullivan | 20.0 | LinkedIn | Fully Meets | 3.02 | 3 | 0 | 5/15/2012 | 0 | 3
3 | Alagbe,Trina | 10088 | 1 | 1 | 0 | 1 | 5 | 3 | 0 | 64991 | 0 | 19 | Production Technician I | MA | 1886 | 09/27/88 | F | Married | US Citizen | No | White | 1/7/2008 | NaN | N/A-StillEmployed | Active | Production | Elijiah Gray | 16.0 | Indeed | Fully Meets | 4.84 | 5 | 0 | 1/3/2019 | 0 | 15
4 | Anderson, Carol | 10069 | 0 | 2 | 0 | 5 | 5 | 3 | 0 | 50825 | 1 | 19 | Production Technician I | MA | 2169 | 09/08/89 | F | Divorced | US Citizen | No | White | 7/11/2011 | 9/6/2016 | return to school | Voluntarily Terminated | Production | Webster Butler | 39.0 | Google Search | Fully Meets | 5.00 | 4 | 0 | 2/1/2016 | 0 | 2
5 | Anderson, Linda | 10002 | 0 | 0 | 0 | 1 | 5 | 4 | 0 | 57568 | 0 | 19 | Production Technician I | MA | 1844 | 05/22/77 | F | Single | US Citizen | No | White | 1/9/2012 | NaN | N/A-StillEmployed | Active | Production | Amy Dunn | 11.0 | LinkedIn | Exceeds | 5.00 | 5 | 0 | 1/7/2019 | 0 | 15
6 | Andreola, Colby | 10194 | 0 | 0 | 0 | 1 | 4 | 3 | 0 | 95660 | 0 | 24 | Software Engineer | MA | 2110 | 05/24/79 | F | Single | US Citizen | No | White | 11/10/2014 | NaN | N/A-StillEmployed | Active | Software Engineering | Alex Sweetwater | 10.0 | LinkedIn | Fully Meets | 3.04 | 3 | 4 | 1/2/2019 | 0 | 19
7 | Athwal, Sam | 10062 | 0 | 4 | 1 | 1 | 5 | 3 | 0 | 59365 | 0 | 19 | Production Technician I | MA | 2199 | 02/18/83 | M | Widowed | US Citizen | No | White | 9/30/2013 | NaN | N/A-StillEmployed | Active | Production | Ketsia Liebig | 19.0 | Employee Referral | Fully Meets | 5.00 | 4 | 0 | 2/25/2019 | 0 | 19
8 | Bachiochi, Linda | 10114 | 0 | 0 | 0 | 3 | 5 | 3 | 1 | 47837 | 0 | 19 | Production Technician I | MA | 1902 | 02/11/70 | F | Single | US Citizen | No | Black or African American | 7/6/2009 | NaN | N/A-StillEmployed | Active | Production | Brannon Miller | 12.0 | Diversity Job Fair | Fully Meets | 4.46 | 3 | 0 | 1/25/2019 | 0 | 4
9 | Bacong, Alejandro | 10250 | 0 | 2 | 1 | 1 | 3 | 3 | 0 | 50178 | 0 | 14 | IT Support | MA | 1886 | 01/07/88 | M | Divorced | US Citizen | No | White | 1/5/2015 | NaN | N/A-StillEmployed | Active | IT/IS | Peter Monroe | 7.0 | Indeed | Fully Meets | 5.00 | 5 | 6 | 2/18/2019 | 0 | 16

Last rows

(index) | Employee_Name | EmpID | MarriedID | MaritalStatusID | GenderID | EmpStatusID | DeptID | PerfScoreID | FromDiversityJobFairID | Salary | Termd | PositionID | Position | State | Zip | DOB | Sex | MaritalDesc | CitizenDesc | HispanicLatino | RaceDesc | DateofHire | DateofTermination | TermReason | EmploymentStatus | Department | ManagerName | ManagerID | RecruitmentSource | PerformanceScore | EngagementSurvey | EmpSatisfaction | SpecialProjectsCount | LastPerformanceReview_Date | DaysLateLast30 | Absences
301 | Wilber, Barry | 10048 | 1 | 1 | 1 | 5 | 5 | 3 | 0 | 55140 | 1 | 19 | Production Technician I | MA | 2324 | 09/09/65 | M | Married | Eligible NonCitizen | No | White | 5/16/2011 | 9/7/2015 | unhappy | Voluntarily Terminated | Production | Amy Dunn | 11.0 | Website | Fully Meets | 5.00 | 3 | 0 | 2/15/2015 | 0 | 7
302 | Wilkes, Annie | 10204 | 0 | 2 | 0 | 5 | 5 | 3 | 0 | 58062 | 1 | 19 | Production Technician I | MA | 1876 | 07/30/83 | F | Divorced | US Citizen | No | White | 1/10/2011 | 5/14/2012 | Another position | Voluntarily Terminated | Production | Ketsia Liebig | 19.0 | Google Search | Fully Meets | 3.60 | 5 | 0 | 2/6/2011 | 0 | 9
303 | Williams, Jacquelyn | 10264 | 0 | 0 | 0 | 5 | 5 | 3 | 1 | 59728 | 1 | 19 | Production Technician I | MA | 2109 | 10/02/69 | F | Single | US Citizen | Yes | Black or African American | 1/9/2012 | 6/27/2015 | relocation out of area | Voluntarily Terminated | Production | Ketsia Liebig | 19.0 | Diversity Job Fair | Fully Meets | 4.30 | 4 | 0 | 6/2/2014 | 0 | 16
304 | Winthrop, Jordan | 10033 | 0 | 0 | 1 | 5 | 5 | 4 | 0 | 70507 | 1 | 20 | Production Technician II | MA | 2045 | 11/07/58 | M | Single | US Citizen | No | White | 1/7/2013 | 2/21/2016 | retiring | Voluntarily Terminated | Production | Brannon Miller | 12.0 | LinkedIn | Exceeds | 5.00 | 3 | 0 | 1/19/2016 | 0 | 7
305 | Wolk, Hang T | 10174 | 0 | 0 | 0 | 1 | 5 | 3 | 0 | 60446 | 0 | 20 | Production Technician II | MA | 2302 | 04/20/85 | F | Single | US Citizen | No | White | 9/29/2014 | NaN | N/A-StillEmployed | Active | Production | David Stanley | 14.0 | LinkedIn | Fully Meets | 3.40 | 4 | 0 | 2/21/2019 | 0 | 14
306 | Woodson, Jason | 10135 | 0 | 0 | 1 | 1 | 5 | 3 | 0 | 65893 | 0 | 20 | Production Technician II | MA | 1810 | 05/11/85 | M | Single | US Citizen | No | White | 7/7/2014 | NaN | N/A-StillEmployed | Active | Production | Kissy Sullivan | 20.0 | LinkedIn | Fully Meets | 4.07 | 4 | 0 | 2/28/2019 | 0 | 13
307 | Ybarra, Catherine | 10301 | 0 | 0 | 0 | 5 | 5 | 1 | 0 | 48513 | 1 | 19 | Production Technician I | MA | 2458 | 05/04/82 | F | Single | US Citizen | No | Asian | 9/2/2008 | 9/29/2015 | Another position | Voluntarily Terminated | Production | Brannon Miller | 12.0 | Google Search | PIP | 3.20 | 2 | 0 | 9/2/2015 | 5 | 4
308 | Zamora, Jennifer | 10010 | 0 | 0 | 0 | 1 | 3 | 4 | 0 | 220450 | 0 | 6 | CIO | MA | 2067 | 08/30/79 | F | Single | US Citizen | No | White | 4/10/2010 | NaN | N/A-StillEmployed | Active | IT/IS | Janet King | 2.0 | Employee Referral | Exceeds | 4.60 | 5 | 6 | 2/21/2019 | 0 | 16
309 | Zhou, Julia | 10043 | 0 | 0 | 0 | 1 | 3 | 3 | 0 | 89292 | 0 | 9 | Data Analyst | MA | 2148 | 02/24/79 | F | Single | US Citizen | No | White | 3/30/2015 | NaN | N/A-StillEmployed | Active | IT/IS | Simon Roup | 4.0 | Employee Referral | Fully Meets | 5.00 | 3 | 5 | 2/1/2019 | 0 | 11
310 | Zima, Colleen | 10271 | 0 | 4 | 0 | 1 | 5 | 3 | 0 | 45046 | 0 | 19 | Production Technician I | MA | 1730 | 08/17/78 | F | Widowed | US Citizen | No | Asian | 9/29/2014 | NaN | N/A-StillEmployed | Active | Production | David Stanley | 14.0 | LinkedIn | Fully Meets | 4.50 | 5 | 0 | 1/30/2019 | 0 | 2